TOXINS ACTIVE AGAINST OSTRINIA NUBILALIS

Background of the Invention
The soil microbe Bacillus thuringiensis (B.t.) is a Gram-positive, spore-forming
bacterium. Most strains of B.t. do not exhibit pesticidal activity. Some B.t. strains produce, and
can be characterized by, parasporal crystalline protein inclusions. These "δ-endotoxins" are
different from exotoxins, which have a non-specific host range. These inclusions often appear
microscopically as distinctively shaped crystals. The proteins can be highly toxic to pests and
specific in their toxic activity. Certain B.t. toxin genes have been isolated and sequenced, and
recombinant DNA-based B.t. products have been produced and approved for use. In addition,
with the use of genetic engineering techniques, new approaches for delivering B.t. toxins to
agricultural environments are under development, including the use of plants genetically
engineered with B.t. toxin genes for insect resistance and the use of stabilized intact microbial
cells as B.t. toxin delivery vehicles (Gaertner, F.H., L. Kim [1988] TIBTECH 6:S4-S7). Thus,
isolated B.t. endotoxin genes are becoming commercially valuable.

Until the last fifteen years, commercial use of B.t. pesticides has been largely restricted
to a narrow range of lepidopteran (caterpillar) pests. Preparations of the spores and crystals of
B. thuringiensis subsp. kurstaki have been used for many years as commercial insecticides for
lepidopteran pests. For example, B. thuringiensis var. kurstaki HD-1 produces a crystalline
δ-endotoxin which is toxic to the larvae of a number of lepidopteran insects.

In recent years, however, investigators have discovered B.t. pesticides with specificities
for a much broader range of pests. For example, other species of B.t., namely israelensis and
morrisoni (a.k.a. tenebrionis, a.k.a. B.t. M-7, a.k.a. B.t. san diego), have been used commercially
to control insects of the orders Diptera and Coleoptera, respectively (Gaertner, F.H. [1989]
"Cellular Delivery Systems for Insecticidal Proteins: Living and Non-Living Microorganisms,"
in Controlled Delivery of Crop Protection Agents, R.M. Wilkins, ed., Taylor and Francis, New
York and London, 1990, pp. 245-255). See also Couch, T.L. (1980) "Mosquito Pathogenicity
of Bacillus thuringiensis var. israelensis," Developments in Industrial Microbiology 22:61-76;
and Beegle, C.C. (1978) "Use of Entomogenous Bacteria in Agroecosystems," Developments
in Industrial Microbiology 20:97-104. Krieg, A., A.M. Huger, G.A. Langenbruch, W.
Schnetter (1983) Z. ang. Ent. 96:500-508 describe Bacillus thuringiensis var. tenebrionis, which
is reportedly active against two beetles in the order Coleoptera. These are the Colorado potato
beetle, Leptinotarsa decemlineata, and Agelastica alni.

Recently, new subspecies of B.t. have been identified, and genes responsible for active
δ-endotoxin proteins have been isolated (Höfte, H., H.R. Whiteley [1989] Microbiological
Reviews 52(2):242-255). Höfte and Whiteley classified B.t. crystal protein genes into four major
classes. The classes were CryI (Lepidoptera-specific), CryII (Lepidoptera- and Diptera-specific),
CryIII (Coleoptera-specific), and CryIV (Diptera-specific). The discovery of strains specifically
toxic to other pests has been reported (Feitelson, J.S., J. Payne, L. Kim [1992] Bio/Technology
10:271-275). CryV has been proposed to designate a class of toxin genes that are
nematode-specific. Lambert et al. (Lambert, B., L. Buysse, C. Decock, S. Jansens, C. Piens, B. Saey, J.
Seurinck, K. van Audenhove, J. Van Rie, A. Van Vliet, M. Peferoen [1996] Appl. Environ.
Microbiol. 62(1):80-86) and Shevelev et al. ([1993] FEBS Lett. 336:79-82) describe the
characterization of Cry9 toxins active against lepidopterans. Published PCT applications WO
94/05771 and WO 94/24264 also describe B.t. isolates active against lepidopteran pests. Gleave
et al. ([1991] JGM 138:55-62) and Smulevitch et al. ([1991] FEBS Lett. 293:25-26) also
describe B.t. toxins. A number of other classes of B.t. genes have now been identified.

The cloning and expression of a B.t. crystal protein gene in Escherichia coli has been
described in the published literature (Schnepf, H.E., H.R. Whiteley [1981] Proc. Natl. Acad. Sci.
USA 78:2893-2897). U.S. Patent 4,448,885 and U.S. Patent 4,467,036 both disclose the
expression of B.t. crystal protein in E. coli. U.S. Patents 4,990,332; 5,039,523; 5,126,133;
5,164,180; and 5,169,629 are among those which disclose B.t. toxins having activity against
lepidopterans. PCT application WO96/05314 discloses PS86W1, PS86V1, and other B.t.
isolates active against lepidopteran pests. The PCT patent applications published as
WO94/24264 and WO94/05771 describe B.t. isolates and toxins active against lepidopteran
pests. B.t. proteins with activity against members of the family Noctuidae are described by
Lambert et al., supra. U.S. Patents 4,797,276 and 4,853,331 disclose B. thuringiensis strain
tenebrionis which can be used to control coleopteran pests in various environments. U.S. Patent
No. 4,918,006 discloses B.t. toxins having activity against dipterans. U.S. Patent No. 5,151,363
and U.S. Patent No. 4,948,734 disclose certain isolates of B.t. which have activity against
nematodes. Other U.S. patents which disclose activity against nematodes include 5,093,120;
5,236,843; 5,262,399; 5,270,448; 5,281,530; 5,322,932; 5,350,577; 5,426,049; and 5,439,881.
As a result of extensive research and investment of resources, other patents have issued for new
B.t. isolates and new uses of B.t. isolates. See Feitelson et al., supra, for a review. However,
the discovery of new B.t. isolates and new uses of known B.t. isolates remains an empirical,
unpredictable art.

Isolating responsible toxin genes has been a slow empirical process. Carozzi et al.
(Carozzi, N.B., V.C. Kramer, G.W. Warren, S. Evola, G. Koziel [1991] Appl. Env. Microbiol.
57(11):3057-3061) describe methods for identifying novel B.t. isolates. This report does not
disclose or suggest the specific primers, probes, toxins, and genes of the subject invention for 
lepidopteran-active toxin genes. U.S. Patent No. 5,204,237 describes specific and universal 
probes for the isolation of B.t. toxin genes. This patent, however, does not describe the probes, 
primers, toxins, and genes of the subject invention. 

WO 94/21795 and Estruch, J.J. et al. ([1996] PNAS 93:5389-5394) describe toxins 
obtained from Bacillus microbes. These toxins are reported to be produced during vegetative 
cell growth and were thus termed vegetative insecticidal proteins (VIP). These toxins were 
reported to be distinct from crystal-forming δ-endotoxins. Activity of these toxins against
lepidopteran pests was reported. 

Black cutworm (Agrotis ipsilon (Hufnagel); Lepidoptera: Noctuidae) is a serious pest
of many crops including maize, cotton, cole crops (Brassica, broccoli, cabbages, Chinese
cabbages), and turf. Secondary host plants include beetroots, Capsicum (peppers), chickpeas,
faba beans, lettuces, lucerne, onions, potatoes, radishes, rape (canola), rice, soybeans,
strawberries, sugarbeet, tobacco, tomatoes, and forest trees. In North America, pests of the
genus Agrotis feed on clover, corn, tobacco, hemp, onion, strawberries, blackberries, raspberries,
alfalfa, barley, beans, cabbage, oats, peas, potatoes, sweet potatoes, tomato, garden flowers,
grasses, lucerne, maize, asparagus, grapes, almost any kind of leaf, weeds, and many other crops
and garden plants. Other cutworms in the Tribe Agrotini are pests, in particular those in the
genus Feltia (e.g., F. jaculifera (Guenée); equivalent to ducens subgothica) and Euxoa (e.g., E.
messoria (Harris), E. scandens (Riley), E. auxiliaris Smith, E. detersa (Walker), E. tessellata
(Harris), E. ochrogaster (Guenée)). Host plants include various crops, including rape.

Cutworms are also pests outside North America, and the more economically significant
pests attack chickpeas, wheat, vegetables, sugarbeet, lucerne, maize, potatoes, turnips, rape,
lettuces, strawberries, loganberries, flax, cotton, soybeans, tobacco, beetroots, Chinese cabbages,
tomatoes, aubergines, sugarcane, pastures, cabbages, groundnuts, Cucurbita, turnips, sunflowers,
Brassica, onions, leeks, celery, sesame, asparagus, rhubarb, chicory, greenhouse crops, and
spinach. The black cutworm A. ipsilon occurs as a pest outside North America, including
Central America, Europe, Asia, Australasia, Africa, India, Taiwan, Mexico, Egypt, and New
Zealand.

Cutworms progress through several instars as larvae. Although seedling cutting by later
instar larvae produces the most obvious damage and economic loss, leaf feeding commonly
results in yield loss in crops such as maize. Upon reaching the fourth larval instar, larvae begin
to cut plants and plant parts, especially seedlings. Because of the shift in feeding behavior,
economically damaging populations may build up unexpectedly with few early warning signs.
Their nocturnal habit and behavior of burrowing into the ground also make detection
problematic. Large cutworms can destroy several seedlings per day, and a heavy infestation can
remove entire stands of crops.

Cultural controls for A. ipsilon such as peripheral weed control can help prevent heavy
infestations; however, such methods are not always feasible or effective. Infestations are very
sporadic, and applying an insecticide prior to planting or at planting has not been effective in the
past. Some baits are available for control of cutworms in crops. To protect turfgrass such as
creeping bentgrass, chemical insecticides have been employed. Use of chemical pesticides is
a particular concern in turf because of the close contact the public has with treated areas (e.g.,
golf greens, athletic fields, parks and other recreational areas, professional landscaping, home
lawns). Natural products (e.g., nematodes, azadirachtin) generally perform poorly. To date,
Bacillus thuringiensis products have not been widely used to control black cutworm because
highly effective toxins have not been available.

Brief Summary of the Invention
The subject invention concerns materials and methods useful in the control of
non-mammalian pests and, particularly, plant pests. In a specific embodiment, the subject invention
provides new toxins useful for the control of lepidopterans. In a particularly preferred
embodiment, the toxins of the subject invention are used to control black cutworm. The subject
invention further provides nucleotide sequences which encode the lepidopteran-active toxins of
the subject invention. The subject invention further provides nucleotide sequences and methods
useful in the identification and characterization of genes which encode pesticidal toxins. The
subject invention further provides new Bacillus thuringiensis isolates.

In one embodiment, the subject invention concerns unique nucleotide sequences which 
are useful as primers in PCR techniques. The primers produce characteristic gene fragments
which can be used in the identification and isolation of specific toxin genes. The nucleotide
sequences of the subject invention encode toxins which are distinct from previously-described
δ-endotoxins.
In one embodiment of the subject invention, isolates can be [...] genomic nucleic acid, the
DNA can be contacted with the primers of [...]

A further aspect of the subject invention is the use of the disclosed nucleotide [...]

Further aspects of the subject invention include the genes and isolates [...]

[...] PS218G2, PS213E5, PS28C, PS86BB1, PS89J3, PS94R1, PS27J2, PS101DD, and PS202S.

[...] chimeric toxins produced by combining portions of multiple [...]

[...] one polynucleotide sequence of the subject invention [...] typically involve modification
of the gene to optimize expression [...]

Alternatively, the isolates of the subject invention [...] In this regard, the invention [...]
for the pesticidal toxin. The toxin becomes active upon ingestion by a target [...]



Brief Description of the Sequences
SEQ ID NO. 1 is a forward primer useful according to the subject invention.
SEQ ID NO. 2 is a reverse primer useful according to the subject invention.
SEQ ID NO. 3 is a forward primer useful according to the subject invention.
SEQ ID NO. 4 is a reverse primer useful according to the subject invention.
SEQ ID NO. 5 is a forward primer useful according to the subject invention.
SEQ ID NO. 6 is a reverse primer useful according to the subject invention.
SEQ ID NO. 7 is an amino acid sequence of the toxin designated 11B1AR.
SEQ ID NO. 8 is a nucleotide sequence encoding an amino acid sequence of toxin 11B1AR (SEQ ID NO. 7).
SEQ ID NO. 9 is an amino acid sequence of the toxin designated 11B1BR.
SEQ ID NO. 10 is a nucleotide sequence encoding an amino acid sequence of toxin 11B1BR (SEQ ID NO. 9).
SEQ ID NO. 11 is an amino acid sequence of the toxin designated 1291A.
SEQ ID NO. 12 is a nucleotide sequence encoding an amino acid sequence of toxin 1291A (SEQ ID NO. 11).
SEQ ID NO. 13 is an amino acid sequence of the toxin designated 1292A.
SEQ ID NO. 14 is a nucleotide sequence encoding an amino acid sequence of toxin 1292A (SEQ ID NO. 13).
SEQ ID NO. 15 is an amino acid sequence of the toxin designated 1292B.
SEQ ID NO. 16 is a nucleotide sequence encoding an amino acid sequence of toxin 1292B (SEQ ID NO. 15).
SEQ ID NO. 17 is an amino acid sequence of the toxin designated 31GA.
SEQ ID NO. 18 is a nucleotide sequence encoding an amino acid sequence of toxin 31GA (SEQ ID NO. 17).
SEQ ID NO. 19 is an amino acid sequence of the toxin designated 31GBR.
SEQ ID NO. 20 is a nucleotide sequence encoding an amino acid sequence of toxin 31GBR (SEQ ID NO. 19).
SEQ ID NO. 21 is an amino acid sequence of the toxin designated 85N1R identified by the method of the subject invention.
SEQ ID NO. 22 is a nucleotide sequence encoding an amino acid sequence of toxin 85N1R (SEQ ID NO. 21).
SEQ ID NO. 23 is an amino acid sequence of the toxin designated 85N2.
SEQ ID NO. 24 is a nucleotide sequence encoding an amino acid sequence of toxin 85N2 (SEQ ID NO. 23).
SEQ ID NO. 25 is an amino acid sequence of the toxin designated 85N3.
SEQ ID NO. 26 is a nucleotide sequence encoding an amino acid sequence of toxin 85N3 (SEQ ID NO. 25).
SEQ ID NO. 27 is an amino acid sequence of the toxin designated 86V1C1.
SEQ ID NO. 28 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1C1 (SEQ ID NO. 27).
SEQ ID NO. 29 is an amino acid sequence of the toxin designated 86V1C2.
SEQ ID NO. 30 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1C2 (SEQ ID NO. 29).
SEQ ID NO. 31 is an amino acid sequence of the toxin designated 86V1C3R.
SEQ ID NO. 32 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1C3R (SEQ ID NO. 31).
SEQ ID NO. 33 is an amino acid sequence of the toxin designated F525A.
SEQ ID NO. 34 is a nucleotide sequence encoding an amino acid sequence of toxin F525A (SEQ ID NO. 33).
SEQ ID NO. 35 is an amino acid sequence of the toxin designated F525B.
SEQ ID NO. 36 is a nucleotide sequence encoding an amino acid sequence of toxin F525B (SEQ ID NO. 35).
SEQ ID NO. 37 is an amino acid sequence of the toxin designated F525C.
SEQ ID NO. 38 is a nucleotide sequence encoding an amino acid sequence of toxin F525C (SEQ ID NO. 37).
SEQ ID NO. 39 is an amino acid sequence of the toxin designated F573A.
SEQ ID NO. 40 is a nucleotide sequence encoding an amino acid sequence of toxin F573A (SEQ ID NO. 39).
SEQ ID NO. 41 is an amino acid sequence of the toxin designated F573B.
SEQ ID NO. 42 is a nucleotide sequence encoding an amino acid sequence of toxin F573B (SEQ ID NO. 41).
SEQ ID NO. 43 is an amino acid sequence of the toxin designated F573C.
SEQ ID NO. 44 is a nucleotide sequence encoding an amino acid sequence of toxin F573C (SEQ ID NO. 43).
SEQ ID NO. 45 is an amino acid sequence of the toxin designated FBB1A.
SEQ ID NO. 46 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1A (SEQ ID NO. 45).
SEQ ID NO. 47 is an amino acid sequence of the toxin designated FBB1BR.
SEQ ID NO. 48 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1BR (SEQ ID NO. 47).
SEQ ID NO. 49 is an amino acid sequence of the toxin designated FBB1C.
SEQ ID NO. 50 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1C (SEQ ID NO. 49).
SEQ ID NO. 51 is an amino acid sequence of the toxin designated FBB1D.
SEQ ID NO. 52 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1D (SEQ ID NO. 51).
SEQ ID NO. 53 is an amino acid sequence of the toxin designated J31AR.
SEQ ID NO. 54 is a nucleotide sequence encoding an amino acid sequence of toxin J31AR (SEQ ID NO. 53).
SEQ ID NO. 55 is an amino acid sequence of the toxin designated J32AR.
SEQ ID NO. 56 is a nucleotide sequence encoding an amino acid sequence of toxin J32AR (SEQ ID NO. 55).
SEQ ID NO. 57 is an amino acid sequence of the toxin designated WIFAR.
SEQ ID NO. 58 is a nucleotide sequence encoding an amino acid sequence of toxin WIFAR (SEQ ID NO. 57).
SEQ ID NO. 59 is an amino acid sequence of the toxin designated WIFBR.
SEQ ID NO. 60 is a nucleotide sequence encoding an amino acid sequence of toxin WIFBR (SEQ ID NO. 59).
SEQ ID NO. 61 is an amino acid sequence of the toxin designated WIFC.
SEQ ID NO. 62 is a nucleotide sequence encoding an amino acid sequence of toxin WIFC (SEQ ID NO. 61).
SEQ ID NO. 63 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 64 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 65 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 66 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 67 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 68 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 69 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 70 is an amino acid sequence of the toxin designated 86BB1(a).
SEQ ID NO. 71 is a nucleotide sequence encoding an amino acid sequence of toxin 86BB1(a).
SEQ ID NO. 72 is an amino acid sequence of the toxin designated 86BB1(b).
SEQ ID NO. 73 is a nucleotide sequence encoding an amino acid sequence of toxin 86BB1(b).
SEQ ID NO. 74 is an amino acid sequence of the toxin designated 31G1(a).
SEQ ID NO. 75 is a nucleotide sequence encoding an amino acid sequence of toxin 31G1(a).
SEQ ID NO. 76 is an amino acid sequence of the toxin designated 129HD chimeric.
SEQ ID NO. 77 is a nucleotide sequence encoding an amino acid sequence of toxin 129HD chimeric.
SEQ ID NO. 78 is an amino acid sequence of the toxin designated 11B(a).
SEQ ID NO. 79 is a nucleotide sequence encoding an amino acid sequence of toxin 11B(a).
SEQ ID NO. 80 is an amino acid sequence of the toxin designated 31G1(b).
SEQ ID NO. 81 is a nucleotide sequence encoding an amino acid sequence of toxin 31G1(b).
SEQ ID NO. 82 is an amino acid sequence of the toxin designated 86BB1(c).
SEQ ID NO. 83 is a nucleotide sequence encoding an amino acid sequence of toxin 86BB1(c).
SEQ ID NO. 84 is an amino acid sequence of the toxin designated 86V1(a).
SEQ ID NO. 85 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1(a).
SEQ ID NO. 86 is an amino acid sequence of the toxin designated 86W1(a).
SEQ ID NO. 87 is a nucleotide sequence encoding an amino acid sequence of toxin 86W1(a).
SEQ ID NO. 88 is a partial amino acid sequence of the toxin designated 94R1(a).
SEQ ID NO. 89 is a partial nucleotide sequence encoding an amino acid sequence of toxin 94R1(a).
SEQ ID NO. 90 is an amino acid sequence of the toxin designated 185U2(a).
SEQ ID NO. 91 is a nucleotide sequence encoding an amino acid sequence of toxin 185U2(a).
SEQ ID NO. 92 is an amino acid sequence of the toxin designated 202S(a).
SEQ ID NO. 93 is a nucleotide sequence encoding an amino acid sequence of toxin 202S(a).
SEQ ID NO. 94 is an amino acid sequence of the toxin designated 213E5(a).
SEQ ID NO. 95 is a nucleotide sequence encoding an amino acid sequence of toxin 213E5(a).
SEQ ID NO. 96 is an amino acid sequence of the toxin designated 218G2(a).
SEQ ID NO. 97 is a nucleotide sequence encoding an amino acid sequence of toxin 218G2(a).
SEQ ID NO. 98 is an amino acid sequence of the toxin designated 29HD(a).
SEQ ID NO. 99 is a nucleotide sequence encoding an amino acid sequence of toxin 29HD(a).
SEQ ID NO. 100 is an amino acid sequence of the toxin designated 110HD(a).
SEQ ID NO. 101 is a nucleotide sequence encoding an amino acid sequence of toxin 110HD(a).
SEQ ID NO. 102 is an amino acid sequence of the toxin designated 129HD(b).
SEQ ID NO. 103 is a nucleotide sequence encoding an amino acid sequence of toxin 129HD(b).
SEQ ID NO. 104 is a partial amino acid sequence of the toxin designated 573HD(a).
SEQ ID NO. 105 is a partial nucleotide sequence encoding an amino acid sequence of toxin 573HD(a).

Detailed Disclosure of the Invention

The subject invention concerns materials and methods for the control of non-mammalian 
pests. In specific embodiments, the subject invention pertains to new Bacillus thuringiensis 
isolates and toxins which have activity against lepidopterans. In a particularly preferred 
embodiment, the toxins and methodologies described herein can be used to control black 
cutworm. The subject invention further concerns novel genes which encode pesticidal toxins 
and novel methods for identifying and characterizing B.t. genes which encode toxins with useful
properties. The subject invention concerns not only the polynucleotide sequences which encode 
these toxins, but also the use of these polynucleotide sequences to produce recombinant hosts 
which express the toxins. 

Certain proteins of the subject invention are distinct from the crystal or "Cry" proteins
which have previously been isolated from Bacillus thuringiensis.

A further aspect of the subject invention concerns novel isolates and the toxins and
genes obtainable from these isolates. The novel B.t. isolates of the subject invention have been
designated PS31G1, PS185U2, PS11B, PS218G2, PS213E5, PS28C, PS86BB1, PS89J3,
PS94R1, PS202S, PS101DD, and PS27J2.

The new toxins and polynucleotide sequences provided here are defined according to 
several parameters. One critical characteristic of the toxins described herein is pesticidal 
activity. In a specific embodiment, these toxins have activity against lepidopteran pests. The 
toxins and genes of the subject invention can be further defined by their amino acid and
nucleotide sequences. The sequences of the molecules can be defined in terms of homology to
certain exemplified sequences as well as in terms of the ability to hybridize with, or be amplified
by, certain exemplified probes and primers. The toxins provided herein can also be identified
based on their immunoreactivity with certain antibodies.

Methods have been developed for making useful chimeric toxins by combining portions
of B.t. crystal proteins. The portions which are combined need not, themselves, be pesticidal so
long as the combination of portions creates a chimeric protein which is pesticidal. This can be
done using restriction enzymes, as described in, for example, European Patent 0 228 838; Ge,
A.Z., N.L. Shivarova, D.H. Dean (1989) Proc. Natl. Acad. Sci. USA 86:4037-4041; Ge, A.Z.,
D. Rivers, R. Milne, D.H. Dean (1991) J. Biol. Chem. 266:17954-17958; Schnepf, H.E., K.
Tomczak, J.P. Ortega, H.R. Whiteley (1990) J. Biol. Chem. 265:20923-20930; Honee, G., D.
Convents, J. Van Rie, S. Jansens, M. Peferoen, B. Visser (1991) Mol. Microbiol. 5:2799-2806.
Alternatively, recombination using cellular recombination mechanisms can be used to achieve
similar results. See, for example, Caramori, T., A.M. Albertini, A. Galizzi (1991) Gene
98:37-44; Widner, W.R., H.R. Whiteley (1990) J. Bacteriol. 172:2826-2832; Bosch, D., B. Schipper,
H. van der Kleij, R.A. de Maagd, W.J. Stiekema (1994) Biotechnology 12:915-918. A number
of other methods are known in the art by which such chimeric DNAs can be made. The subject
invention is meant to include chimeric proteins that utilize the novel sequences identified in the
subject application.

With the teachings provided herein, one skilled in the art could readily produce and use
the various toxins and polynucleotide sequences described herein.

B.t. isolates useful according to the subject invention have been deposited in the
permanent collection of the Agricultural Research Service Patent Culture Collection (NRRL),
Northern Regional Research Center, 1815 North University Street, Peoria, Illinois 61604, USA.
The culture repository numbers of the B.t. strains are as follows:

Culture                             Repository No.    Deposit Date
B.t. PS11B (MT274)                  NRRL B-21556      April 18, 1996
B.t. PS86BB1 (MT275)                NRRL B-21557      April 18, 1996
B.t. PS86V1 (MT276)                 NRRL B-21558      April 18, 1996
B.t. PS86W1 (MT277)                 NRRL B-21559      April 18, 1996
B.t. PS31G1 (MT278)                 NRRL B-21560      April 18, 1996
B.t. PS89J3 (MT279)                 NRRL B-21561      April 18, 1996
B.t. PS185U2 (MT280)                NRRL B-21562      April 18, 1996
B.t. PS27J2                         NRRL B-21799      July 1, 1997
B.t. PS28C                          NRRL B-21800      July 1, 1997
B.t. PS94R1                         NRRL B-21801      July 1, 1997
B.t. PS101DD                        NRRL B-21802      July 1, 1997
B.t. PS202S                         NRRL B-21803      July 1, 1997
B.t. PS213E5                        NRRL B-21804      July 1, 1997
B.t. PS218G2                        NRRL B-21805      July 1, 1997
E. coli NM522 (MR922) (pMYC2451)    NRRL B-21794      June 27, 1997
E. coli NM522 (MR923) (pMYC2453)    NRRL B-21795      June 27, 1997
E. coli NM522 (MR924) (pMYC2454)    NRRL B-21796      June 27, 1997

Cultures which have been deposited for the purposes of this patent application were
deposited under conditions that assure that access to the cultures is available during the
pendency of this patent application to one determined by the Commissioner of Patents and
Trademarks to be entitled thereto under 37 CFR 1.14 and 35 U.S.C. 122. The deposits will be
available as required by foreign patent laws in countries wherein counterparts of the subject
application, or its progeny, are filed. However, it should be understood that the availability of
a deposit does not constitute a license to practice the subject invention in derogation of patent
rights granted by governmental action.

Further, the subject culture deposits will be stored and made available to the public in
accord with the provisions of the Budapest Treaty for the Deposit of Microorganisms, i.e., they
will be stored with all the care necessary to keep them viable and uncontaminated for a period
of at least five years after the most recent request for the furnishing of a sample of the deposit,
and in any case, for a period of at least thirty (30) years after the date of deposit or for the
enforceable life of any patent which may issue disclosing the culture(s). The depositor
acknowledges the duty to replace the deposit(s) should the depository be unable to furnish a
sample when requested, due to the condition of a deposit. All restrictions on the availability to
the public of the subject culture deposits will be irrevocably removed upon the granting of a
patent disclosing them.

Following is a table which provides characteristics of certain isolates useful according 
to the subject invention. 



Table 1. Description of B.t. strains toxic to lepidopterans

Culture    Crystal Description    Approx. MW (kDa)                    Serotype
PS185U2    small bipyramid        130 kDa doublet, 70 kDa             ND
PS11B      bipyramid tort         130 kDa, 70 kDa                     ND
PS218G2    amorphic               135 kDa, 127 kDa                    ND
PS213E5    amorphic               130 kDa                             ND
PS86W1     multiple amorphic      130 kDa doublet                     5a5b galleriae
PS28C      amorphic               130 kDa triplet                     5a5b galleriae
PS86BB1    BP without             130 kDa doublet                     5a5b galleriae
PS89J3     spherical/amorphic     130 kDa doublet                     ND
PS86V1     BP                     130 kDa doublet                     ND
PS94R1     BP and amorphic        130 kDa doublet                     ND
HD525      BP and amorphic        130 kDa                             not motile
HD573      multiple amorphic      135 kDa, 79 kDa doublet, 72 kDa     not motile
PS27J2     lemon-shaped           130 kDa, 50 kDa                     4 (sotto or kenyae)

ND = not determined



In one embodiment, the subject invention concerns materials and methods including 
nucleotide primers and probes for isolating and identifying Bacillus thuringiensis (B.t.) genes
encoding protein toxins which are active against lepidopteran pests. The nucleotide sequences
described herein can also be used to identify new pesticidal B.t. isolates. The invention further
concerns the genes, isolates, and toxins identified using the methods and materials disclosed 
herein. 
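
By way of illustration only, the manner in which a primer pair delimits a characteristic gene fragment can be sketched in Python. The template and primer sequences below are hypothetical placeholders, not SEQ ID NOS. 1-6, and the function name `amplicon` is an assumption made for this sketch. Exact string matching stands in for primer annealing, which in practice tolerates some mismatch.

```python
# Minimal in-silico PCR sketch (hypothetical sequences, exact matching only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the fragment bounded by the two primers, or None if either
    primer site is absent from the given template strand."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on this strand
    # its site appears as the reverse complement of the primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical 32-base template containing both primer sites.
TEMPLATE = "TTTACGTACGGATTACAGGCCTTAAGCTTGGG"
FORWARD = "ACGTACG"
REVERSE = "CAAGCTT"

fragment = amplicon(TEMPLATE, FORWARD, REVERSE)
```

In this sketch, a template lacking either primer site yields `None`, which corresponds to the absence of a characteristic fragment for that primer pair.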

Genes and toxins. The genes and toxins useful according to the subject invention
include not only the full length sequences but also fragments of these sequences, variants,
mutants, and fusion proteins which retain the characteristic pesticidal activity of the toxins
specifically exemplified herein. Chimeric genes and toxins, produced by combining portions
from more than one B.t. toxin or gene, may also be utilized according to the teachings of the
subject invention. As used herein, the terms "variants" or "variations" of genes refer to
nucleotide sequences which encode the same toxins or which encode equivalent toxins having
pesticidal activity. As used herein, the term "equivalent toxins" refers to toxins having the same
or essentially the same biological activity against the target pests as the exemplified toxins.

It should be apparent to a person skilled in this art that genes encoding active toxins can 
be identified and obtained through several means. The specific genes exemplified herein may 
be obtained from the isolates deposited at a culture depository as described above. These genes,
or portions or variants thereof, may also be constructed synthetically, for example, by use of a
gene synthesizer. Variations of genes may be readily constructed using standard techniques for
making point mutations. Also, fragments of these genes can be made using commercially
available exonucleases or endonucleases according to standard procedures. For example,
enzymes such as Bal31 or site-directed mutagenesis can be used to systematically cut off
nucleotides from the ends of these genes. Also, genes which encode active fragments may be 
obtained using a variety of restriction enzymes. Proteases may be used to directly obtain active 
fragments of these toxins. 

Equivalent toxins and/or genes encoding these equivalent toxins can be derived from 
B.t. isolates and/or DNA libraries using the teachings provided herein. There are a number of
methods for obtaining the pesticidal toxins of the instant invention. For example, antibodies to
the pesticidal toxins disclosed and claimed herein can be used to identify and isolate other toxins
from a mixture of proteins. Specifically, antibodies may be raised to the portions of the toxins
which are most constant and most distinct from other B.t. toxins. These antibodies can then be
used to specifically identify equivalent toxins with the characteristic activity by
immunoprecipitation, enzyme linked immunosorbent assay (ELISA), or western blotting.
Antibodies to the toxins disclosed herein, or to equivalent toxins, or fragments of these toxins,
can readily be prepared using standard procedures in this art. The genes which encode these
toxins can then be obtained from the microorganism.

Fragments and equivalents which retain the pesticidal activity of the exemplified toxins
would be within the scope of the subject invention. Also, because of the redundancy of the
genetic code, a variety of different DNA sequences can encode the amino acid sequences
disclosed herein. It is well within the skill of a person trained in the art to create these
alternative DNA sequences encoding the same, or essentially the same, toxins. These variant
DNA sequences are within the scope of the subject invention. As used herein, reference to
"essentially the same" sequence refers to sequences which have amino acid substitutions,
deletions, additions, or insertions which do not materially affect pesticidal activity. Fragments
retaining pesticidal activity are also included in this definition.
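
The redundancy of the genetic code noted above can be illustrated with a minimal Python sketch. The four-residue peptide "MKDF" and the partial codon table are assumptions chosen only for illustration and are not sequences of the subject invention; the codon assignments themselves are those of the standard genetic code.

```python
from itertools import product

# Partial standard genetic code, limited to the residues used in this example.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "D": ["GAT", "GAC"],
    "F": ["TTT", "TTC"],
    "W": ["TGG"],
}


def encodings(peptide):
    """Return every DNA sequence that encodes the given peptide."""
    choices = [SYNONYMOUS_CODONS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


def translate(dna):
    """Translate DNA back to a peptide using the same partial table."""
    codon_to_aa = {c: aa for aa, cs in SYNONYMOUS_CODONS.items() for c in cs}
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical peptide: 1 * 2 * 2 * 2 = 8 distinct DNA sequences encode it.
variants = encodings("MKDF")
print(len(variants))
```

Every sequence returned by `encodings` translates back to the same peptide, which is the sense in which the variant DNA sequences encode "the same" toxin.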

A further method for identifying the toxins and genes of the subject invention is through 
the use of oligonucleotide probes. These probes are detectable nucleotide sequences. Probes 
provide a rapid method for identifying toxin-encoding genes of the subject invention. The 
nucleotide segments which are used as probes according to the invention can be synthesized 
using a DNA synthesizer and standard procedures. 

Certain toxins of the subject invention have been specifically exemplified herein. Since 
these toxins are merely exemplary of the toxins of the subject invention, it should be readily 
apparent that the subject invention comprises variant or equivalent toxins (and nucleotide 
sequences coding for equivalent toxins) having the same or similar pesticidal activity of the 
exemplified toxin. Equivalent toxins will have amino acid homology with an exemplified toxin. 
This amino acid identity will typically be greater than 60%, preferably be greater than 75%, 
more preferably greater than 80%, more preferably greater than 90%, and can be greater than 
95%. The amino acid homology will be highest in critical regions of the toxin which account for 
biological activity or are involved in the determination of three-dimensional configuration which 
ultimately is responsible for the biological activity. In this regard, certain amino acid 
substitutions are acceptable and can be expected if these substitutions are in regions which are 
not critical to activity or are conservative amino acid substitutions which do not affect the three- 
dimensional configuration of the molecule. For example, amino acids may be placed in the 
following classes: non-polar, uncharged polar, basic, and acidic. Conservative substitutions 
whereby an amino acid of one class is replaced with another amino acid of the same type fall 
within the scope of the subject invention so long as the substitution does not materially alter the 
biological activity of the compound. Table 2 provides a listing of examples of amino acids 
belonging to each class. 
Table 2.

Class of Amino Acid    Examples of Amino Acids
Nonpolar               Ala, Val, Leu, Ile, Pro, Met, Phe, Trp
Uncharged Polar        Gly, Ser, Thr, Cys, Tyr, Asn, Gln
Acidic                 Asp, Glu
Basic                  Lys, Arg, His
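
The class-based definition of a conservative substitution can be expressed as a simple lookup. The following Python sketch is illustrative only: the function names are hypothetical, the residues use the standard three-letter codes, and membership in the same class does not by itself establish that biological activity is retained, which must still be determined by bioassay.

```python
# Amino acid classes as listed in Table 2 (standard three-letter codes).
AMINO_ACID_CLASSES = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"},
    "uncharged polar": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}


def class_of(residue):
    """Return the Table 2 class name for a three-letter residue code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def is_conservative(original, replacement):
    """True when both residues fall in the same Table 2 class."""
    return class_of(original) == class_of(replacement)


print(is_conservative("Leu", "Ile"))  # both nonpolar
print(is_conservative("Asp", "Lys"))  # acidic versus basic
```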



In some instances, non-conservative substitutions can also be made. The critical factor
is that these substitutions must not significantly detract from the biological activity of the toxin.

The toxins of the subject invention can also be characterized in terms of the shape and 
location of toxin inclusions, which are described above. 

As used herein, reference to "isolated" polynucleotides and/or "purified" toxins refers
to these molecules when they are not associated with the other molecules with which they would
be found in nature. Thus, "purified" toxins would include, for example, the subject toxins
expressed in plants. Reference to "isolated and purified" signifies the involvement of the "hand
of man" as described herein. Chimeric toxins and genes also involve the "hand of man."

Recombinant hosts. The toxin-encoding genes harbored by the isolates of the subject
invention can be introduced into a wide variety of microbial or plant hosts. Expression of the
toxin gene results, directly or indirectly, in the intracellular production and maintenance of the
pesticide. With suitable microbial hosts, e.g., Pseudomonas, the microbes can be applied to the
situs of the pest, where they will proliferate and be ingested. The result is a control of the pest.
Alternatively, the microbe hosting the toxin gene can be treated under conditions that prolong
the activity of the toxin and stabilize the cell. The treated cell, which retains the toxic activity,
then can be applied to the environment of the target pest.

Where the B.t. toxin gene is introduced via a suitable vector into a microbial host, and
said host is applied to the environment in a living state, it is essential that certain host microbes
be used. Microorganism hosts are selected which are known to occupy the "phytosphere"
(phylloplane, phyllosphere, rhizosphere, and/or rhizoplane) of one or more crops of interest.
These microorganisms are selected so as to be capable of successfully competing in the
particular environment (crop and other insect habitats) with the wild-type microorganisms,
provide for stable maintenance and expression of the gene expressing the polypeptide pesticide,
and, desirably, provide for improved protection of the pesticide from environmental degradation
and inactivation.

A large number of microorganisms are known to inhabit the phylloplane (the surface
of the plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide variety
of important crops. These microorganisms include bacteria, algae, and fungi. Of particular
interest are microorganisms, such as bacteria, e.g., genera Pseudomonas, Erwinia, Serratia,
Klebsiella, Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas, Methylophilius,
Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter, Azotobacter, Leuconostoc, and
Alcaligenes; fungi, particularly yeast, e.g., genera Saccharomyces, Cryptococcus,
Kluyveromyces, Sporobolomyces, Rhodotorula, and Aureobasidium. Of particular interest are
such phytosphere bacterial species as Pseudomonas syringae, Pseudomonas fluorescens,
Serratia marcescens, Acetobacter xylinum, Agrobacterium tumefaciens, Rhodopseudomonas
spheroides, Xanthomonas campestris, Rhizobium meliloti, Alcaligenes eutrophus, and
Azotobacter vinelandii; and phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R.
marina, R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces rosei,
S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus, Kluyveromyces veronae, and
Aureobasidium pullulans. Of particular interest are the pigmented microorganisms.

A wide variety of ways are available for introducing a B.t. gene encoding a toxin into
a microorganism host under conditions which allow for stable maintenance and expression of
the gene. These methods are well known to those skilled in the art and are described, for
example, in United States Patent No. 5,135,867, which is incorporated herein by reference.

Control of lepidopterans, including black cutworm, using the isolates, toxins, and genes
of the subject invention can be accomplished by a variety of methods known to those skilled in
the art. These methods include, for example, the application of B.t. isolates to the pests (or their
location), the application of recombinant microbes to the pests (or their locations), and the
transformation of plants with genes which encode the pesticidal toxins of the subject invention.
Recombinant microbes may be, for example, a B.t., E. coli, or Pseudomonas. Transformations
can be made by those skilled in the art using standard techniques. Materials necessary for these
transformations are disclosed herein or are otherwise readily available to the skilled artisan.

Synthetic genes which are functionally equivalent to the toxins of the subject invention
can also be used to transform hosts. Methods for the production of synthetic genes can be found
in, for example, U.S. Patent No. 5,380,831.

Treatment of cells. As mentioned above, B.t. or recombinant cells expressing a B.t.
toxin can be treated to prolong the toxin activity and stabilize the cell. The pesticide
microcapsule that is formed comprises the B.t. toxin within a cellular structure that has been
stabilized and will protect the toxin when the microcapsule is applied to the environment of the
target pest. Suitable host cells may include either prokaryotes or eukaryotes, normally being
limited to those cells which do not produce substances toxic to higher organisms, such as
mammals. However, organisms which produce substances toxic to higher organisms could be
used, where the toxic substances are unstable or the level of application sufficiently low as to
avoid any possibility of toxicity to a mammalian host. As hosts, of particular interest will be
the prokaryotes and the lower eukaryotes, such as fungi.

The cell will usually be intact and be substantially in the proliferative form when
treated, rather than in a spore form, although in some instances spores may be employed.

Treatment of the microbial cell, e.g., a microbe containing the B.t. toxin gene, can be
by chemical or physical means, or by a combination of chemical and/or physical means, so long
as the technique does not deleteriously affect the properties of the toxin, nor diminish the
cellular capability of protecting the toxin. Examples of chemical reagents are halogenating
agents, particularly halogens of atomic no. 17-80. More particularly, iodine can be used under
mild conditions and for sufficient time to achieve the desired results. Other suitable techniques
include treatment with aldehydes, such as glutaraldehyde; anti-infectives, such as zephiran
chloride and cetylpyridinium chloride; alcohols, such as isopropyl and ethanol; various
histologic fixatives, such as Lugol iodine, Bouin's fixative, various acids and Helly's fixative
(See: Humason, Gretchen L., Animal Tissue Techniques, W.H. Freeman and Company, 1967);
or a combination of physical (heat) and chemical agents that preserve and prolong the activity
of the toxin produced in the cell when the cell is administered to the host environment.
Examples of physical means are short wavelength radiation such as gamma-radiation and
X-radiation, freezing, UV irradiation, lyophilization, and the like. Methods for treatment of
microbial cells are disclosed in United States Patent Nos. 4,695,455 and 4,695,462, which are
incorporated herein by reference.

The cells generally will have enhanced structural stability which will enhance resistance
to environmental conditions. Where the pesticide is in a proform, the method of cell treatment
should be selected so as not to inhibit processing of the proform to the mature form of the
pesticide by the target pest pathogen. For example, formaldehyde will crosslink proteins and
could inhibit processing of the proform of a polypeptide pesticide. The method of treatment
should retain at least a substantial portion of the bio-availability or bioactivity of the toxin.

Characteristics of particular interest in selecting a host cell for purposes of production 
include ease of introducing the B.t. gene into the host, availability of expression systems,
efficiency of expression, stability of the pesticide in the host, and the presence of auxiliary
genetic capabilities. Characteristics of interest for use as a pesticide microcapsule include 
protective qualities for the pesticide, such as thick cell walls, pigmentation, and intracellular 
packaging or formation of inclusion bodies; survival in aqueous environments; lack of 
mammalian toxicity; attractiveness to pests for ingestion; ease of killing and fixing without 
damage to the toxin; and the like. Other considerations include ease of formulation and 
handling, economics, storage stability, and the like. 

Growth of cells. The cellular host containing the B.t. insecticidal gene may be grown
in any convenient nutrient medium, where the DNA construct provides a selective advantage,
providing for a selective medium so that substantially all or all of the cells retain the B.t. gene.
These cells may then be harvested in accordance with conventional ways. Alternatively, the
cells can be treated prior to harvesting.

The B.t. cells of the invention can be cultured using standard art media and fermentation
techniques. Upon completion of the fermentation cycle the bacteria can be harvested by first
separating the B.t. spores and crystals from the fermentation broth by means well known in the
art. The recovered B.t. spores and crystals can be formulated into a wettable powder, liquid
concentrate, granules or other formulations by the addition of surfactants, dispersants, inert
carriers, and other components to facilitate handling and application for particular target pests.
These formulations and application procedures are all well known in the art.

Methods and formulations for control of pests. Control of lepidopterans using the
isolates, toxins, and genes of the subject invention can be accomplished by a variety of methods
known to those skilled in the art. These methods include, for example, the application of B.t.
isolates to the pests (or their location), the application of recombinant microbes to the pests (or
their locations), and the transformation of plants with genes which encode the pesticidal toxins
of the subject invention. Recombinant microbes may be, for example, a B.t., E. coli, or
Pseudomonas. Transformations can be made by those skilled in the art using standard
techniques. Materials necessary for these transformations are disclosed herein or are otherwise
readily available to the skilled artisan.

Formulated bait granules containing an attractant and spores and crystals of the B.t.
isolates, or recombinant microbes comprising the genes obtainable from the B.t. isolates
disclosed herein, can be applied to the soil. Formulated product can also be applied as a
seed-coating or root treatment or total plant treatment at later stages of the crop cycle. Plant and soil
treatments of B.t. cells may be employed as wettable powders, granules or dusts, by mixing with
various inert materials, such as inorganic minerals (phyllosilicates, carbonates, sulfates,
phosphates, and the like) or botanical materials (powdered corncobs, rice hulls, walnut shells,
and the like). The formulations may include spreader-sticker adjuvants, stabilizing agents, other
pesticidal additives, or surfactants. Liquid formulations may be aqueous-based or non-aqueous
and employed as foams, gels, suspensions, emulsifiable concentrates, or the like. The
ingredients may include rheological agents, surfactants, emulsifiers, dispersants, or polymers.

As would be appreciated by a person skilled in the art, the pesticidal concentration will
vary widely depending upon the nature of the particular formulation, particularly whether it is
a concentrate or to be used directly. The pesticide will be present in at least 1% by weight and
may be 100% by weight. The dry formulations will have from about 1-95% by weight of the
pesticide while the liquid formulations will generally be from about 1-60% by weight of the
solids in the liquid phase. The formulations will generally have from about 10^2 to about 10^4
cells/mg. These formulations will be administered at about 50 mg (liquid or dry) to 1 kg or
more per hectare.

The formulations can be applied to the environment of the pest, e.g., soil and foliage,
by spraying, dusting, sprinkling, or the like.

Mutants. Mutants of the isolates of the invention can be made by procedures well
known in the art. For example, an asporogenous mutant can be obtained through ethylmethane
sulfonate (EMS) mutagenesis of an isolate. The mutants can be made using ultraviolet light and
nitrosoguanidine by procedures well known in the art.

A smaller percentage of the asporogenous mutants will remain intact and not lyse for
extended fermentation periods; these strains are designated lysis minus (-). Lysis minus strains
can be identified by screening asporogenous mutants in shake flask media and selecting those
mutants that are still intact and contain toxin crystals at the end of the fermentation. Lysis
minus strains are suitable for a cell treatment process that will yield a protected, encapsulated
toxin protein.

To prepare a phage resistant variant of said asporogenous mutant, an aliquot of the
phage lysate is spread onto nutrient agar and allowed to dry. An aliquot of the phage sensitive
bacterial strain is then plated directly over the dried lysate and allowed to dry. The plates are
incubated at 30°C for 2 days, at which time numerous colonies can be seen growing on the
agar. Some of these colonies are picked and subcultured onto
nutrient agar plates. These apparent resistant cultures are tested for resistance by cross streaking
with the phage lysate. A line of the phage lysate is streaked on the plate and allowed to dry.
The presumptive resistant cultures are then streaked across the phage line. Resistant bacterial
cultures show no lysis anywhere in the streak across the phage line after overnight incubation
at 30°C. The resistance to phage is then reconfirmed by plating a lawn of the resistant culture
onto a nutrient agar plate. The sensitive strain is also plated in the same manner to serve as the
positive control. After drying, a drop of the phage lysate is placed in the center of the plate and
allowed to dry. Resistant cultures show no lysis in the area where the phage lysate has been
placed after incubation at 30°C for 24 hours.

Polynucleotide probes. It is well known that DNA possesses a fundamental property
called base complementarity. In nature, DNA ordinarily exists in the form of pairs of
anti-parallel strands, the bases on each strand projecting from that strand toward the opposite strand.
The base adenine (A) on one strand will always be opposed to the base thymine (T) on the other
strand, and the base guanine (G) will be opposed to the base cytosine (C). The bases are held
in apposition by their ability to hydrogen bond in this specific way. Though each individual
bond is relatively weak, the net effect of many adjacent hydrogen bonded bases, together with
base stacking effects, is a stable joining of the two complementary strands. These bonds can be
broken by treatments such as high pH or high temperature, and these conditions result in the
dissociation, or "denaturation," of the two strands. If the DNA is then placed in conditions
which make hydrogen bonding of the bases thermodynamically favorable, the DNA strands will
anneal, or "hybridize," and reform the original double stranded DNA. If carried out under
appropriate conditions, this hybridization can be highly specific. That is, only strands with a
high degree of base complementarity will be able to form stable double stranded structures. The
relationship of the specificity of hybridization to reaction conditions is well known. Thus,
hybridization may be used to test whether two pieces of DNA are complementary in their base
sequences. It is this hybridization mechanism which facilitates the use of probes of the subject
invention to readily detect and characterize DNA sequences of interest.
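
The base-pairing rule described above (A opposite T, G opposite C, on anti-parallel strands) can be sketched in a few lines of Python. The function names and the example sequence are hypothetical and serve only to illustrate the rule; the check shown requires perfect complementarity, whereas actual hybridization can tolerate some mismatch depending on conditions.

```python
# Watson-Crick pairing: A<->T and G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the anti-parallel complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def are_complementary(strand_a, strand_b):
    """True when strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a) == strand_b.upper()


probe = "ATGCCGTA"  # hypothetical probe sequence
print(reverse_complement(probe))
print(are_complementary(probe, "TACGGCAT"))
```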

The probes may be RNA or DNA. The probe will normally have at least about 10 bases,
more usually at least about 18 bases, and may have up to about 50 bases or more, usually not
having more than about 200 bases if the probe is made synthetically. However, longer probes
can readily be utilized, and such probes can be, for example, several kilobases in length. The
probe sequence is designed to be at least substantially complementary to a gene encoding a toxin
of interest. The probe need not have perfect complementarity to the sequence to which it
hybridizes. The probes may be labelled utilizing techniques which are well known to those
skilled in this art.

One approach for the use of the subject invention as probes entails first identifying by
Southern blot analysis of a gene bank of the B.t. isolate all DNA segments homologous with the
disclosed nucleotide sequences. Thus, it is possible, without the aid of biological analysis, to
know in advance the probable activity of many new B.t. isolates, and of the individual endotoxin
gene products expressed by a given B.t. isolate. Such a probe analysis provides a rapid method
for identifying potentially commercially valuable insecticidal endotoxin genes within the
multifarious subspecies of B.t.

One hybridization procedure useful according to the subject invention typically includes
the initial steps of isolating the DNA sample of interest and purifying it chemically. Either lysed
bacteria or total fractionated nucleic acid isolated from bacteria can be used. Cells can be
treated using known techniques to liberate their DNA (and/or RNA). The DNA sample can be
cut into pieces with an appropriate restriction enzyme. The pieces can be separated by size
through electrophoresis in a gel, usually agarose or acrylamide. The pieces of interest can be
transferred to an immobilizing membrane in a manner that retains the geometry of the pieces.
The membrane can then be dried and prehybridized to equilibrate it for later immersion in a
hybridization solution. The manner in which the nucleic acid is affixed to a solid support may
vary. This fixing of the DNA for later processing has great value for the use of this technique
in field studies, remote from laboratory facilities.

The particular hybridization technique is not essential to the subject invention. As 
improvements are made in hybridization techniques, they can be readily applied. 

As is well known in the art, if the probe molecule and nucleic acid sample hybridize by 
forming a strong non-covalent bond between the two molecules, it can be reasonably assumed 
that the probe and sample are essentially identical. The probe's detectable label provides a 
means for determining in a known manner whether hybridization has occurred. 

The nucleotide segments of the subject invention which are used as probes can be
synthesized by use of DNA synthesizers using standard procedures. In the use of the nucleotide
segments as probes, the particular probe is labeled with any suitable label known to those skilled
in the art, including radioactive and non-radioactive labels. Typical radioactive labels include
32P, 35S, or the like. A probe labeled with a radioactive isotope can be constructed from a
nucleotide sequence complementary to the DNA sample by a conventional nick translation
reaction, using a DNase and DNA polymerase. The probe and sample can then be combined in
a hybridization buffer solution and held at an appropriate temperature until annealing occurs.
Thereafter, the membrane is washed free of extraneous materials, leaving the sample and bound
probe molecules typically detected and quantified by autoradiography and/or liquid scintillation
counting. For synthetic probes, it may be most desirable to use enzymes such as polynucleotide
kinase or terminal transferase to end-label the DNA for use as probes.
Non-radioactive labels include, for example, ligands such as biotin or thyroxine, as well
as enzymes such as hydrolases or peroxidases, or the various chemiluminescers such as
luciferin, or fluorescent compounds like fluorescein and its derivatives. The probes may be
made inherently fluorescent as described in International Application No. WO93/16094. The
probe may also be labeled at both ends with different types of labels for ease of separation, as,
for example, by using an isotopic label at the end mentioned above and a biotin label at the other
end.

The amount of labeled probe which is present in the hybridization solution will vary
widely, depending upon the nature of the label, the amount of the labeled probe which can
reasonably bind to the filter, and the stringency of the hybridization. Generally, substantial
excesses of the probe will be employed to enhance the rate of binding of the probe to the fixed
DNA.

Various degrees of stringency of hybridization can be employed. The more severe the 
conditions, the greater the complementarity that is required for duplex formation. Severity can 
be controlled by temperature, probe concentration, probe length, ionic strength, time, and the 
like. Preferably, hybridization is conducted under stringent conditions by techniques well 
known in the art, as described, for example, in Keller, G.H., M.M. Manak (1987) DNA Probes, 
Stockton Press, New York, NY., pp. 169-170. 

As used herein "stringent" conditions for hybridization refers to conditions which
achieve the same, or about the same, degree of specificity of hybridization as the conditions
employed by the current applicants. Specifically, hybridization of immobilized DNA on
Southern blots with 32P-labeled gene-specific probes was performed by standard methods
(Maniatis, T., E.F. Fritsch, J. Sambrook [1982] Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, Cold Spring Harbor, NY). In general, hybridization and subsequent
washes were carried out under stringent conditions that allowed for detection of target sequences
with homology to the exemplified toxin genes. For double-stranded DNA gene probes,
hybridization was carried out overnight at 20-25°C below the melting temperature (Tm) of the
DNA hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The
melting temperature is described by the following formula (Beltz, G.A., K.A. Jacobs, T.H.
Eickbush, P.T. Cherbas, and F.C. Kafatos [1983] Methods of Enzymology, R. Wu, L. Grossman
and K. Moldave [eds.] Academic Press, New York 100:266-285).

Tm = 81.5°C + 16.6 Log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/length of duplex in base pairs.
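
As a worked illustration of the formula above, the following Python sketch evaluates Tm and the corresponding hybridization window of 20-25°C below Tm. The input values are hypothetical and chosen only to make the arithmetic easy to follow; the sketch also assumes that "Log" denotes the base-10 logarithm and that [Na+] is expressed in moles per liter.

```python
import math


def duplex_tm(na_molar, gc_percent, formamide_percent, duplex_length_bp):
    """Melting temperature (deg C) of a DNA duplex per the formula above.

    Assumes a base-10 logarithm and [Na+] in mol/l.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / duplex_length_bp)


# Hypothetical inputs: 1.0 M Na+, 50% G+C, no formamide, 600 bp duplex.
tm = duplex_tm(na_molar=1.0, gc_percent=50.0,
               formamide_percent=0.0, duplex_length_bp=600)

# Hybridization is carried out 20-25 deg C below Tm.
hyb_low, hyb_high = tm - 25.0, tm - 20.0
print(f"Tm = {tm:.1f} C; hybridize at {hyb_low:.1f}-{hyb_high:.1f} C")
```

With these inputs the terms are 81.5 + 0 + 20.5 - 0 - 1.0, so Tm is 101.0°C and the hybridization window is 76.0-81.0°C.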

Washes are typically carried out as follows: 
(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low
stringency wash).

(2) Once at Tm-20°C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate
stringency wash).

For oligonucleotide probes, hybridization was carried out overnight at 10-20°C below
the melting temperature (Tm) of the hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1
mg/ml denatured DNA. Tm for oligonucleotide probes was determined by the following
formula:

Tm (°C) = 2(number T/A base pairs) + 4(number G/C base pairs)

(Suggs, S.V., T. Miyake, E.H. Kawashima, M.J. Johnson, K. Itakura, and R.B. Wallace [1981]
ICN-UCLA Symp. Dev. Biol. Using Purified Genes, D.D. Brown [ed.], Academic Press, New
York, 23:683-693).
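
The oligonucleotide formula above lends itself to a direct calculation. The following Python sketch is illustrative only; the 18-base probe sequence is hypothetical and is not a sequence of the subject invention.

```python
def oligo_tm(probe):
    """Tm (deg C) = 2 * (number of A/T bases) + 4 * (number of G/C bases)."""
    seq = probe.upper()
    at_pairs = seq.count("A") + seq.count("T")
    gc_pairs = seq.count("G") + seq.count("C")
    return 2 * at_pairs + 4 * gc_pairs


# Hypothetical 18-mer: 10 A/T bases and 8 G/C bases.
probe = "ATGCATGCATGCATGCAT"
tm = oligo_tm(probe)

# Hybridization is carried out 10-20 deg C below Tm.
print(f"Tm = {tm} C; hybridize at {tm - 20}-{tm - 10} C")
```

For this probe the estimate is 2(10) + 4(8) = 52°C, giving a hybridization window of 32-42°C.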

Washes were typically carried out as follows: 

(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency
wash).

(2) Once at the hybridization temperature for 15 minutes in 1X SSPE, 0.1% SDS
(moderate stringency wash).

Duplex formation and stability depend on substantial complementarity between the two 
strands of a hybrid, and, as noted above, a certain degree of mismatch can be tolerated. 
Therefore, the nucleotide sequences of the subject invention include mutations (both single and 
multiple), deletions, insertions of the described sequences, and combinations thereof, wherein 
said mutations, insertions and deletions permit formation of stable hybrids with the target 
polynucleotide of interest. Mutations, insertions, and deletions can be produced in a given 
polynucleotide sequence in many ways, and these methods are known to an ordinarily skilled 
artisan. Other methods may become known in the future.

The known methods include, but are not limited to: 

(1) synthesizing chemically or otherwise an artificial sequence which is a mutation,
insertion or deletion of the known sequence; 

(2) using a nucleotide sequence of the present invention as a probe to obtain via 
hybridization a new sequence or a mutation, insertion or deletion of the probe 
sequence; and 

(3) mutating, inserting or deleting a test sequence in vitro or in vivo. 
It is important to note that the mutational, insertional, and deletional variants generated
from a given probe may be more or less efficient than the original probe. Notwithstanding such
differences in efficiency, these variants are within the scope of the present invention.

Thus, mutational, insertional, and deletional variants of the disclosed nucleotide
sequences can be readily prepared by methods which are well known to those skilled in the art.
These variants can be used in the same manner as the exemplified primer sequences so long as
the variants have substantial sequence homology with the original sequence. As used herein,
substantial sequence homology refers to homology which is sufficient to enable the variant to
function in the same capacity as the original probe. Preferably, this homology is greater than
50%; more preferably, this homology is greater than 75%; and most preferably, this homology
is greater than 90%. The degree of homology needed for the variant to function in its intended
capacity will depend upon the intended use of the sequence. It is well within the skill of a
person trained in this art to make mutational, insertional, and deletional mutations which are
designed to improve the function of the sequence or otherwise provide a methodological
advantage.
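
The homology thresholds recited above can be illustrated with a simplified calculation. The following Python sketch is an assumption-laden illustration: it compares two pre-aligned sequences of equal length position by position, whereas a practical comparison would first align the sequences and account for insertions and deletions. The sequences and function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def homology_tier(percent):
    """Map a percent value onto the preference tiers recited above."""
    if percent > 90:
        return "most preferred (>90%)"
    if percent > 75:
        return "more preferred (>75%)"
    if percent > 50:
        return "preferred (>50%)"
    return "below the stated preference"


original = "ATGCATGCAT"  # hypothetical probe
variant = "ATGCATGCAA"   # same probe with one substitution
pct = percent_identity(original, variant)
print(pct, homology_tier(pct))
```

Here nine of ten positions match, so the value is 90.0%, which is not greater than 90% and therefore falls in the "greater than 75%" tier.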

PCR technology. Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed 
synthesis of a nucleic acid sequence. This procedure is well known and commonly used by 
those skilled in this art (see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki, 
Randall K., Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich, 
Norman Arnheim [1985] "Enzymatic Amplification of β-Globin Genomic Sequences and 
Restriction Site Analysis for Diagnosis of Sickle Cell Anemia," Science 230:1350-1354). PCR 
is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two 
oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers 
are oriented with the 3' ends pointing towards each other. Repeated cycles of heat denaturation 
of the template, annealing of the primers to their complementary sequences, and extension of 
the annealed primers with a DNA polymerase result in the amplification of the segment defined 
by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a 
template for the other primer, each cycle essentially doubles the amount of DNA fragment 
produced in the previous cycle. This results in the exponential accumulation of the specific 
target fragment, up to several million-fold in a few hours. By using a thermostable DNA 
polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus 
aquaticus, the amplification process can be completely automated. 
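
The doubling per cycle described above can be illustrated with a short calculation. The following is a minimal sketch, assuming an idealized reaction in which every cycle multiplies the target by (1 + efficiency); the efficiency parameter and function name are illustrative and are not taken from the text.

```python
def pcr_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in target copies after `cycles` rounds of PCR.

    efficiency = 1.0 models perfect doubling; lower values model the
    incomplete extension seen in practice (an assumption of this sketch).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return (1.0 + efficiency) ** cycles


for n in (10, 20, 22, 30):
    print(f"{n} cycles: {pcr_fold_amplification(n):,.0f}-fold")
```

Under perfect doubling, roughly 20 to 22 cycles already give an amplification in the million-fold range, consistent with the accumulation described in the text.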

The DNA sequences of the subject invention can be used as primers for PCR 
amplification. In performing PCR amplification, a certain degree of mismatch can be tolerated 
between primer and template. Therefore, mutations, deletions, and insertions (especially 
additions of nucleotides to the 5' end) of the exemplified primers fall within the scope of the 
subject invention. Mutations, insertions, and deletions can be produced in a given primer by 
methods known to an ordinarily skilled artisan. It is important to note that the mutational, 
insertional, and deletional variants generated from a given primer sequence may be more or less 
efficient than the original sequences. Notwithstanding such differences in efficiency, these 
variants are within the scope of the present invention. 

Following are examples which illustrate procedures for practicing the invention. These 
examples should not be construed as limiting. All percentages are by weight and all solvent 
mixture proportions are by volume unless otherwise noted. 

Example 1 - Culturing of B.t. Isolates Useful According to the Invention 

A subculture of B.t. isolates, or mutants thereof, can be used to inoculate the following 
peptone, glucose, salts medium: 

Bacto Peptone        7.5 g/l 
Glucose              1.0 g/l 
KH2PO4               3.4 g/l 
K2HPO4               4.35 g/l 
Salts Solution       5.0 ml/l 
CaCl2 Solution       5.0 ml/l 
pH 7.2 

Salts Solution (100 ml) 
MgSO4·7H2O           2.46 g 
MnSO4·H2O            0.04 g 
ZnSO4·7H2O           0.28 g 
FeSO4·7H2O           0.40 g 

CaCl2 Solution (100 ml) 
CaCl2·2H2O           3.66 g 

The salts solution and CaCl2 solution are filter-sterilized and added to the autoclaved 
and cooked broth at the time of inoculation. Flasks are incubated at 30°C on a rotary shaker at 
200 rpm for 64 hr. 
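
For flask-scale work the per-liter recipe above has to be scaled to the working volume. The following is a minimal sketch of that arithmetic; the dictionary keys and function name are hypothetical, and the two additive volumes are assumed to read 5.0 ml/l each.

```python
# Per-liter amounts for the peptone, glucose, salts medium of Example 1.
PEPTONE_GLUCOSE_SALTS_PER_L = {
    "Bacto Peptone (g)": 7.5,
    "Glucose (g)": 1.0,
    "KH2PO4 (g)": 3.4,
    "K2HPO4 (g)": 4.35,
    "Salts Solution (ml)": 5.0,   # assumed reading of the garbled entry
    "CaCl2 Solution (ml)": 5.0,   # assumed reading of the garbled entry
}


def scale_recipe(per_liter: dict, volume_l: float) -> dict:
    """Scale every component linearly to the requested volume in liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: amount * volume_l for name, amount in per_liter.items()}


# Example: a 250 ml shake-flask culture.
for name, amount in scale_recipe(PEPTONE_GLUCOSE_SALTS_PER_L, 0.25).items():
    print(f"{name}: {amount:.3f}")
```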

The above procedure can be readily scaled up to large fermentors by procedures well 
known in the art. 

The B.t. spores and/or crystals, obtained in the above fermentation, can be isolated by 
procedures well known in the art. A frequently-used procedure is to subject the harvested 
fermentation broth to separation techniques, e.g., centrifugation. 

Alternatively, a subculture of B.t. isolates, or mutants thereof, can be used to inoculate 
the following medium, known as TB broth: 

Tryptone             12 g/l 
Yeast Extract        24 g/l 
Glycerol             4 g/l 
KH2PO4               2.1 g/l 
K2HPO4               14.7 g/l 
pH 7.4 

The potassium phosphate was added to the autoclaved broth after cooling. Flasks were 
incubated at 30°C on a rotary shaker at 250 rpm for 24-36 hours. 

The above procedure can be readily scaled up to large fermentors by procedures well 
known in the art. 

The B.t. obtained in the above fermentation can be isolated by procedures well known 
in the art. A frequently-used procedure is to subject the harvested fermentation broth to 
separation techniques, e.g., centrifugation. In a specific embodiment, B.t. proteins useful 
according to the present invention can be obtained from the supernatant. The culture supernatant 
containing the active protein(s) was used in bioassays as discussed below. 

Example 2 - Identification of Genes Encoding Novel Lepidopteran-Active Bacillus thuringiensis 
Toxins 

Two primer pairs useful for the identification and classification of novel toxin genes by 
PCR amplification of polymorphic DNA fragments near the 3' ends of B.t. toxin genes were 
designed. These oligonucleotide primers allow the discrimination of genes encoding toxins in 
the Cry7, Cry8, or Cry9 subfamilies from genes for the more common lepidopteran-active toxins 
in the Cry1 subfamily based on size differences for the amplified DNA. The sequences of these 
primers are: 

Forward 1 5' CXjTGGCTATATCCTTCGTGTYAC 3' (SEQ ID NO. 1) 
Reverse 1 5' ACRAraAATGTTCCITCYGTTTC 3' (SEQ ID NO. 2) 
Forward 2 5' GGATATCTMTTACGTGTAACWGC 3' (SEQ ID NO. 3) 
Reverse 2 5' CTACACnTCrATRTrGAATOYACCTrC 3' (SEQ ID NO. 4) 
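
The primers above contain degenerate positions (for example Y, R, M, and W), so each primer stands for a small pool of concrete sequences. The following is a minimal sketch that expands the standard IUPAC ambiguity codes; it uses the Forward 2 sequence exactly as printed here, and the function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """Return every concrete sequence represented by a degenerate primer."""
    try:
        choices = [IUPAC[base] for base in primer.upper()]
    except KeyError as exc:
        raise ValueError(f"not an IUPAC nucleotide code: {exc.args[0]}") from None
    return ["".join(combo) for combo in product(*choices)]


forward_2 = "GGATATCTMTTACGTGTAACWGC"  # SEQ ID NO. 3 as printed in the text
for sequence in expand_degenerate(forward_2):
    print(sequence)
```

With one M and one W position, Forward 2 expands to four concrete oligonucleotides.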

Standard PCR amplification (Perkin Elmer, Foster City, CA) using primer pair 1 (SEQ 
ID NOS. 1 and 2) of the subject invention yields DNA fragments approximately 415-440 base 
pairs in length from B.t. toxin genes related to the cry1 subfamily. 

PCR amplification using primer pair 2 (SEQ ID NOS. 3 and 4) according to the subject 
invention yields DNA fragments approximately 230-290 base pairs in length from cry7, cry8, 
or cry9 subfamily toxin genes. 
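
The size-based discrimination described above amounts to a simple lookup. The following is a minimal sketch, treating the approximate size ranges given in the text as hard limits, which is a simplifying assumption; the names are hypothetical.

```python
# Approximate amplicon size ranges (base pairs) stated for each primer pair.
EXPECTED_AMPLICONS = {
    1: ((415, 440), "cry1 subfamily"),
    2: ((230, 290), "cry7, cry8, or cry9 subfamily"),
}


def classify_amplicon(primer_pair: int, size_bp: int):
    """Return the subfamily suggested by a fragment size, or None if out of range."""
    (low, high), subfamily = EXPECTED_AMPLICONS[primer_pair]
    return subfamily if low <= size_bp <= high else None


print(classify_amplicon(1, 430))   # within the range for primer pair 1
print(classify_amplicon(2, 260))   # within the range for primer pair 2
print(classify_amplicon(2, 430))   # outside the range for primer pair 2
```

In practice, band sizes read from a gel carry their own uncertainty, so borderline sizes would warrant sequencing rather than a lookup.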

These primers can be used according to the subject invention to identify genes encoding 
novel toxins. Crude DNA templates for PCR were prepared from B.t. strains. A loopful of cells 
was scraped from an overnight plate culture of Bacillus thuringiensis and resuspended in 300 
μl TE buffer (10 mM Tris-Cl, 1 mM EDTA, pH 8.0). Proteinase K was added to 0.1 mg/ml and 
the cell suspension was heated to 55°C for 15 minutes. The suspension was then boiled for 15 
minutes. Cellular debris was pelleted in a microfuge and the supernatant containing the DNA 
was transferred to a clean tube. 

PCR was carried out using the primer pair consisting of the Forward 2 (SEQ ID NO. 3) 
and Reverse 2 (SEQ ID NO. 4) oligonucleotides described above. Strains were identified that 
contained genes characterized by amplification of DNA fragments approximately 230-290 bp 
in length. Spore-crystal preparations from these strains were subsequently tested for bioactivity 
against Agrotis ipsilon and additional lepidopteran targets. 

PS185U2 was examined using both primer pairs 1 and 2 (SEQ ID NOS. 1 and 2 and 
SEQ ID NOS. 3 and 4, respectively). In this strain, primer pair 1 (SEQ ID NOS. 1 and 2) 
yielded a DNA band of the size expected for toxin genes related to the cry1 subfamily. 

Example 3 - Toxin Genes Present in Lepidopteran-Active Strains 

Total cellular DNA was prepared from Bacillus thuringiensis (B.t.) strains grown to an 
optical density, at 600 nm, of 1.0. Cells were pelleted by centrifugation and resuspended in 
protoplast buffer (20 mg/ml lysozyme in 0.3 M sucrose, 25 mM Tris-Cl [pH 8.0], 25 mM 
EDTA). After incubation at 37°C for 1 hour, protoplasts were lysed by two cycles of freezing 
and thawing. Nine volumes of a solution of 0.1 M NaCl, 0.1% SDS, 0.1 M Tris-Cl were added 
to complete lysis. The cleared lysate was extracted twice with phenol:chloroform (1:1). Nucleic 
acids were precipitated with two volumes of ethanol and pelleted by centrifugation. The pellet 
was resuspended in TE buffer and RNase was added to a final concentration of 50 μg/ml. After 
incubation at 37°C for 1 hour, the solution was extracted once each with phenol:chloroform 
(1:1) and TE-saturated chloroform. DNA was precipitated from the aqueous phase by the 
addition of one-tenth volume of 3 M NaOAc and two volumes of ethanol. DNA was pelleted by 
centrifugation, washed with 70% ethanol, dried, and resuspended in TE buffer. 
Two types of PCR-amplified, 32P-labeled DNA probes were used in standard Southern 
hybridizations of total cellular B.t. DNA to characterize toxin genes by RFLP. The first probe 
(A) was a DNA fragment amplified using the following primers: 

Forward 3: 5' CCAGWTITAYAGGAGG 3' (SEQ ID NO. 5) 
Reverse 3: 5' GTAAACAAGCTCGCCACCGC 3' (SEQ ID NO. 6) 

The second probe (B) was either the 230-290 bp or 415-440 bp DNA fragment amplified 
with the primers described in the previous example. 

Hybridization of immobilized DNA on Southern blots with the aforementioned 
32P-labeled probes was performed by standard methods (Maniatis, T., E.F. Fritsch, J. Sambrook 
[1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring 
Harbor, NY). In general, hybridization and subsequent washes were carried out under moderate 
stringency. For double-stranded DNA gene probes, hybridization was carried out overnight at 
20-25°C below the melting temperature (Tm) of the DNA hybrid in 6X SSPE, 5X Denhardt's 
solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is described by the 
following formula (Beltz, G.A., K.A. Jacobs, T.H. Eickbush, P.T. Cherbas, and F.C. Kafatos 
[1983] In Methods in Enzymology, R. Wu, L. Grossman and K. Moldave [eds.], Academic Press, 
New York 100:266-285): 

Tm = 81.5°C + 16.6 Log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/length of duplex 
in base pairs. 

Washes were typically carried out as follows: 

(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency 
wash). 

(2) Once at Tm - 20°C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate stringency 
wash). 
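
The melting-temperature formula and the temperature offsets given above can be evaluated numerically. The following is a minimal sketch; the example inputs (1.0 M sodium ion as a rough stand-in for 6X SSPE, 40% G+C, no formamide, a 300 bp duplex) are illustrative assumptions and are not values from the text.

```python
import math


def melting_temperature(na_molar: float, gc_percent: float,
                        formamide_percent: float, length_bp: int) -> float:
    """Tm in degrees C: 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/length."""
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("sodium concentration and duplex length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / length_bp)


def hybridization_window(tm: float) -> tuple:
    """Hybridization range stated in the text: 20-25 degrees C below Tm."""
    return (tm - 25.0, tm - 20.0)


tm = melting_temperature(na_molar=1.0, gc_percent=40.0,
                         formamide_percent=0.0, length_bp=300)
low, high = hybridization_window(tm)
print(f"Tm = {tm:.1f} C; hybridize at {low:.1f}-{high:.1f} C; "
      f"moderate stringency wash at {tm - 20.0:.1f} C")
```

Note that the wash is performed in a lower-salt buffer than the hybridization; recomputing Tm for the wash buffer's sodium concentration may be appropriate, although the text does not say which Tm is meant.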
RFLP data was obtained for the ten strains most active on Agrotis ipsilon (Tables 3 and 
4). The hybridizing DNA bands described here contain all or part of the novel toxin genes under 
investigation. 

Example 4 - DNA Sequencing of Toxin Genes 

PCR-amplified segments of toxin genes present in B.t. strains active on Agrotis ipsilon 
were sequenced. To accomplish this, amplified DNA fragments obtained using primers Forward 
3 (SEQ ID NO. 5) and Reverse 3 (SEQ ID NO. 6) were first cloned into the PCR DNA 
TA-cloning plasmid vector, pCRII, as described by the supplier (Invitrogen, San Diego, CA). 
Several individual pCRII clones from the mixture of amplified DNA fragments from each B.t. 
strain were chosen for sequencing. Colonies were lysed by boiling to release crude plasmid 
DNA. DNA templates for automated sequencing were amplified by PCR using vector-specific 
primers flanking the plasmid multiple cloning sites. These DNA templates were sequenced 
using Applied Biosystems (Foster City, CA) automated sequencing methodologies. Toxin gene 
sequences and their corresponding nucleotide sequences, described below (SEQ ID NO. 7 
through SEQ ID NO. 62), were identified by this method. These sequences are listed in Table 
5. The polypeptide sequences deduced from these nucleotide sequences are also shown. 

From these partial gene sequences, seven oligonucleotides useful as PCR primers or 
hybridization probes were designed. The sequences of these oligonucleotides are the following: 

5' GTTCATIGGTATAAGAGTTGGTG 3' (SEQ ID NO. 63) 
5' CCACTGCAAGTCCGGACCAAATTCG 3' (SEQ ID NO. 64) 
5' GAATATATTCCCGTCYATCTCTGG 3' (SEQ ID NO. 65) 
5' GCACGAATTACTGTAGCGATAGG 3' (SEQ ID NO. 66) 
5' GCTGGTAACnTGGAGATATGCGTG 3' (SEQ ID NO. 67) 
5' GATTTUnTGTAACACGTGGAGG 3' (SEQ ID NO. 68) 
5' CACTACTAATCAGAGCGATCTG 3' (SEQ ID NO. 69) 

Specific gene toxin sequences and the oligonucleotide probes that enable identification 
of these genes by hybridization, or by PCR in combination with the Reverse 3 primer described 
above, are listed in Table 5. 
Table 5. Sequence ID reference numbers 

Strain     Toxin      Peptide          Nucleotide       Probe used 
PS11B      11B1AR     SEQ ID NO. 7     SEQ ID NO. 8     SEQ ID NO. 65 
           11B1BR     SEQ ID NO. 9     SEQ ID NO. 10 
HD129      1291A      SEQ ID NO. 11    SEQ ID NO. 12    SEQ ID NO. 63 
           1292A      SEQ ID NO. 13    SEQ ID NO. 14    SEQ ID NO. 64 
           1292B      SEQ ID NO. 15    SEQ ID NO. 16 
PS31G1     31GA       SEQ ID NO. 17    SEQ ID NO. 18    SEQ ID NO. 65 
           31GBR      SEQ ID NO. 19    SEQ ID NO. 20 
PS185U2    85N1R      SEQ ID NO. 21    SEQ ID NO. 22    SEQ ID NO. 66 
           85N2       SEQ ID NO. 23    SEQ ID NO. 24 
           85N3       SEQ ID NO. 25    SEQ ID NO. 26 
PS86V1     86V1C1     SEQ ID NO. 27    SEQ ID NO. 28    SEQ ID NO. 68 
           86V1C2     SEQ ID NO. 29    SEQ ID NO. 30    SEQ ID NO. 64 
           86V1C3R    SEQ ID NO. 31    SEQ ID NO. 32    SEQ ID NO. 69 
HD525      F525A      SEQ ID NO. 33    SEQ ID NO. 34    SEQ ID NO. 64 
           F525B      SEQ ID NO. 35    SEQ ID NO. 36    SEQ ID NO. 63 
           F525C      SEQ ID NO. 37    SEQ ID NO. 38 
HD573      F573A      SEQ ID NO. 39    SEQ ID NO. 40    SEQ ID NO. 63 
           F573B      SEQ ID NO. 41    SEQ ID NO. 42    SEQ ID NO. 67 
           F573C      SEQ ID NO. 43    SEQ ID NO. 44    SEQ ID NO. 64 
PS86BB1    FBB1A      SEQ ID NO. 45    SEQ ID NO. 46    SEQ ID NO. 68 
           FBB1BR     SEQ ID NO. 47    SEQ ID NO. 48    SEQ ID NO. 69 
           FBB1C      SEQ ID NO. 49    SEQ ID NO. 50    SEQ ID NO. 64 
           FBB1D      SEQ ID NO. 51    SEQ ID NO. 52    SEQ ID NO. 63 
PS89J3     J31AR      SEQ ID NO. 53    SEQ ID NO. 54    SEQ ID NO. 68 
           J32AR      SEQ ID NO. 55    SEQ ID NO. 56    SEQ ID NO. 64 
PS86W1     W1FAR      SEQ ID NO. 57    SEQ ID NO. 58    SEQ ID NO. 68 
           W1FBR      SEQ ID NO. 59    SEQ ID NO. 60    SEQ ID NO. 69 
           W1FC       SEQ ID NO. 61    SEQ ID NO. 62    (illegible) 


Example 5 - Isolation and DNA Sequencing of Full-Length Toxin Genes 

Total cellular DNA was extracted from B.t. strains using standard procedures known in 
the art. See, e.g., Example 3, above. Gene libraries of size-fractionated Sau3A partial 
restriction fragments of total cellular DNA were constructed in the bacteriophage vector, 
Lambda-Gem11. Recombinant phage were packaged and plated on E. coli KW251 cells. 
Plaques were screened by hybridization with radiolabeled gene-specific probes derived from 
DNA fragments PCR-amplified with oligonucleotide primers SEQ ID NOS. 5 and 6. 
Hybridizing phage were plaque-purified and used to infect liquid cultures of E. coli KW251 
cells for isolation of DNA by standard procedures (Maniatis, T., E.F. Fritsch, J. Sambrook 
[1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring 
Harbor, NY). Toxin genes were subsequently subcloned into pBluescript vectors (Stratagene) 
for DNA sequence analysis. 

The full-length toxin genes listed below were sequenced using Applied Biosystems 
(Foster City, CA) automated sequencing methodologies. The toxin gene sequences and the 
respective predicted polypeptide sequences are listed below. 



Source Strain    Peptide SEQ ID     Nucleotide SEQ ID    Toxin designation 
PS86BB1          SEQ ID NO. 70      SEQ ID NO. 71        86BB1(a) 
PS86BB1          SEQ ID NO. 72      SEQ ID NO. 73        86BB1(b) 
PS31G1           SEQ ID NO. 74      SEQ ID NO. 75        31G1(a) 


Recombinant E. coli NM522 strains containing these plasmids encoding these toxins were 
deposited with NRRL on June 27, 1997. 

Strain    Plasmid     Toxin designation    NRRL number 
MR922     pMYC2451    86BB1(a)             B-21794 
MR923     pMYC2453    86BB1(b)             B-21795 
MR924     pMYC2454    31G1(a)              B-21796 



Full-length toxin genes were engineered into plasmid vectors by standard DNA cloning 
methods, and transformed into Pseudomonas fluorescens for expression. Recombinant bacterial 
strains (Table 6) were grown in shake flasks for production of toxin for expression and 
quantitative bioassay against a variety of lepidopteran insect pests. 

Table 6. Recombinant Pseudomonas fluorescens strains for heterologous expression of 
novel toxins 

Source Strain    Plasmid        Toxin       Recombinant P.f. Strain 
PS86BB1          pMYC2804       86BB1(a)    MR1259 
PS86BB1          pMYC2805       86BB1(b)    MR1260 
PS31G1           (illegible)    31G1(a)     MR1264 
Example 7 - Processing of Endotoxins with Trypsin 

Cultures of Pseudomonas fluorescens were grown for 48 hrs. as per standard procedures. 
Cell pellets were harvested by centrifugation and washed three times with water and stored at 
-70°C. Endotoxin inclusions were isolated from cells treated with lysozyme and DNAse by 
differential centrifugation. Toxins isolated in this manner were then processed to limit peptides 
by trypsinolysis and were then used for bioassays on lepidopteran pests. 

Detailed protocols follow. Toxin inclusion bodies were prepared from the washed crude 
cell pellets as follows: 

4 L of Lysis Buffer (prepare day of use) 

                         gm 
Tris base                24.22 
NaCl                     46.75 
Glycerol                 252 
Dithiothreitol           0.62 
EDTA Disodium salt       29.78 
Triton X-100             20 ml 

Adjust pH to 7.5 with HCl and bring up to final volume (4 L) with distilled water. 
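
As a cross-check on the lysis buffer recipe, the gram amounts can be converted to approximate molar concentrations. The following is a minimal sketch; the Tris entry is assumed to read 24.22 g, the molar masses are standard reference values that do not appear in the text, and the EDTA salt is assumed to be the dihydrate.

```python
# grams per 4 L and approximate molar mass (g/mol) for each solid component
LYSIS_BUFFER_4L = {
    "Tris base": (24.22, 121.14),                      # assumed reading of "2422"
    "NaCl": (46.75, 58.44),
    "Dithiothreitol": (0.62, 154.25),
    "EDTA disodium salt (dihydrate)": (29.78, 372.24),
}


def millimolar(grams: float, molar_mass: float, volume_l: float = 4.0) -> float:
    """Convert a weighed mass to a concentration in mmol per liter."""
    return grams / molar_mass / volume_l * 1000.0


for name, (grams, molar_mass) in LYSIS_BUFFER_4L.items():
    print(f"{name}: {millimolar(grams, molar_mass):.1f} mM")
```

Under these assumptions the recipe works out to round figures of about 50 mM Tris, 200 mM NaCl, 1 mM dithiothreitol, and 20 mM EDTA, which supports the decimal placement assumed for the Tris entry.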

1. Thaw frozen cell pellet in 37°C water bath. 

2. Add the lysis buffer until the 500 ml polycarbonate centrifuge bottles are as full 
as possible, ~400 ml total volume. Disperse by inversion of the bottle or using 
the Polytron at low rpm. 

3. Centrifuge (10,000 x g) for 20 minutes at 4°C. 

4. Decant and discard supernatant. 

5. Resuspend pellet in 5 ml of lysis buffer for every gram of pellet, using the 
Polytron at low rpm to disperse the pellet. 

6. Add 25 mg/ml lysozyme solution to the suspension to a final concentration of 
0.6 mg/ml. 

7. Incubate at 37°C for 4 minutes. Invert every 30 seconds. 

8. Place suspension on ice for 1 hour. 

9. Add 2.5 M MgCl2·6H2O to the tubes to a final concentration of 60 mM. Add a 
40 mg/ml deoxyribonuclease I (Sigma) solution to get a final concentration of 
0.5 mg/ml. 

10. Incubate overnight at 4°C. 

11. Homogenize the lysate using the Polytron at low rpm. 

12. Centrifuge at 10,000 x g at 4°C for 20 minutes. Decant and discard supernatant. 

13. Resuspend the inclusion pellet in lysis buffer. Check microscopically for 
complete cell lysis. 

14. Wash the inclusion pellet in lysis buffer 5 times (repeat steps 2-5). 

15. Store as a suspension in 10 mM Tris-Cl pH 7.5, 0.1 mM PMSF at 
-70°C in 1.5 ml Eppi tubes. 
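
Steps 6 and 9 above specify stock and final concentrations but not the volumes to add. The following is a minimal sketch of that calculation, assuming the final concentration refers to the volume after the stock has been added and treating each addition independently; the 100 ml suspension volume and the function name are illustrative.

```python
def stock_volume_to_add(final_conc: float, stock_conc: float,
                        suspension_volume: float) -> float:
    """Volume of stock to add so the mixture reaches `final_conc`.

    Solves stock_conc * v = final_conc * (suspension_volume + v) for v.
    Concentrations share one unit; volumes share another.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * suspension_volume / (stock_conc - final_conc)


suspension_ml = 100.0  # hypothetical suspension volume
additions = {
    "lysozyme (25 mg/ml -> 0.6 mg/ml)": stock_volume_to_add(0.6, 25.0, suspension_ml),
    "MgCl2 (2.5 M -> 0.060 M)": stock_volume_to_add(0.060, 2.5, suspension_ml),
    "DNase I (40 mg/ml -> 0.5 mg/ml)": stock_volume_to_add(0.5, 40.0, suspension_ml),
}
for name, volume in additions.items():
    print(f"{name}: add {volume:.2f} ml per {suspension_ml:.0f} ml")
```

Because each addition slightly increases the total volume, sequential additions dilute the earlier components by a few percent; the sketch ignores that effect.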

Digestion of inclusions with trypsin is performed as follows: 
Digestion solution: 

1. 2 ml 1 M NaCAPS pH 10.5 

2. Inclusion preparation (as much as 100 mg protein) 

3. Trypsin at a 1:100 ratio with the amount of protein to be cleaved (added during 
the procedure) 

4. H2O to a final volume of 10 ml 

Trypsin treatment is performed as follows: 

1. Incubate the digestion solution, minus trypsin, at 37°C for 15 minutes. 

2. Add trypsin at 1:100 (trypsin:toxin protein wt/wt). 

3. Incubate solution for 2 hours at 37°C with occasional mixing by inversion. 

4. Centrifuge the digestion solution for 15 minutes at 15,000 x g at 4°C. 

5. Remove and save the supernatant. 

6. Supernatant is analyzed by SDS-PAGE and used for bioassay as discussed 
below. 
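
The quantities in the digestion solution above follow from two simple ratios. The following is a minimal sketch of the arithmetic; the function names are hypothetical and the defaults mirror the stated 1:100 weight ratio and the 2 ml of 1 M buffer in a 10 ml final volume.

```python
def trypsin_mass_mg(toxin_protein_mg: float, ratio: float = 100.0) -> float:
    """Trypsin mass for a 1:ratio (trypsin:toxin protein, wt/wt) digestion."""
    if toxin_protein_mg < 0:
        raise ValueError("protein mass must be non-negative")
    return toxin_protein_mg / ratio


def final_buffer_molarity(stock_molar: float = 1.0, stock_ml: float = 2.0,
                          final_ml: float = 10.0) -> float:
    """Buffer concentration after dilution to the final digestion volume."""
    return stock_molar * stock_ml / final_ml


print(trypsin_mass_mg(100.0))    # mg trypsin for the 100 mg upper limit
print(final_buffer_molarity())   # molar NaCAPS in the 10 ml digestion
```

For the maximum load of 100 mg protein this corresponds to 1 mg trypsin in a 0.2 M NaCAPS digestion.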

Example 8 - Expression of a Gene from B.t. strain HD129 in a Chimeric Construct 

A gene was isolated from B.t. strain HD129. This gene appears to be a pseudogene with 
no obvious translational initiation codon. To express this gene from HD129, we designed and 
constructed a gene fusion with the first 28 codons of cry1Ac in a Pseudomonas expression system. 
The nucleotide and peptide sequences of this chimeric toxin are shown in SEQ ID NOS. 76 and 
77. Upon induction, recombinant P. fluorescens containing this novel chimeric toxin expressed 
the polypeptide of the predicted size. 

Example 9 - Further Sequencing of Toxin Genes 

DNA of soluble toxins from the isolates listed in Table 7 was sequenced. The SEQ ID 
NOS. of the sequences thus obtained are also reported in Table 7. 



Table 7. 

Source Isolate    Protein SEQ ID NO.    Nucleotide SEQ ID NO.    Toxin Name 
PS11B             78                    79                       11B(a) 
PS31G1            80                    81                       31G1(b) 
PS86BB1           82                    83                       86BB1(c) 
PS86V1            84                    85                       86V1(a) 
PS86W1            86                    87                       86W1(a) 
PS94R1            88                    89                       94R1(a) 
PS185U2           90                    91                       185U2(a) 
PS202S            92                    93                       202S(a) 
PS213E5           94                    95                       213E5(a) 
PS218G2           96                    97                       218G2(a) 
HD29              98                    99                       29HD(a) 
HD110             100                   101                      110HD(a) 
HD129             102                   103                      129HD(b) 
HD573             104                   105                      573HD(a) 



Example 10 - Black Cutworm Bioassay 

Suspensions of powders containing B.t. isolates were prepared by mixing an appropriate 
amount of powder with distilled water and agitating vigorously. Suspensions were mixed with 
black cutworm artificial diet (BioServ, Frenchtown, NJ) amended with 28 grams alfalfa powder 
(BioServ) and 1.2 ml formalin per liter of finished diet. Suspensions were mixed with finished 
artificial diet at a rate of 3 ml suspension plus 27 ml diet. After vortexing, this mixture was 
poured into plastic trays with compartmentalized 3 ml wells (Nutrend Container Corporation, 
Jacksonville, FL). A water blank containing no B.t. served as the control. Early first-instar 
Agrotis ipsilon larvae (French Agricultural Services, Lamberton, MN) were placed singly onto 
the diet mixture. Wells were then sealed with "MYLAR" sheeting (ClearLam Packaging, IL) 
using a tacking iron, and several pinholes were made in each well to provide gas exchange. 
Larvae were held at 29°C for four days in a 14:10 (light:dark) holding room. Mortality was 
recorded after four days. 
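
Mixing 3 ml of suspension with 27 ml of diet dilutes the suspension tenfold, so the suspension has to be prepared at ten times the intended dose in the diet. The following is a minimal sketch of that calculation; the function name is hypothetical and the target doses are illustrative.

```python
def suspension_concentration(target_in_diet: float,
                             suspension_ml: float = 3.0,
                             diet_ml: float = 27.0) -> float:
    """Concentration the suspension needs so the mixed diet reaches the target.

    The dilution factor is the total mixed volume divided by the suspension volume.
    """
    dilution_factor = (suspension_ml + diet_ml) / suspension_ml
    return target_in_diet * dilution_factor


# Illustrative target doses in micrograms toxin per ml of finished mixture.
for dose in (200.0, 100.0, 50.0, 25.0):
    print(f"{dose:g} ug/ml in diet -> {suspension_concentration(dose):g} ug/ml suspension")
```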

The following B.t. isolates were found to have activity against black cutworm: 
PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, 
HD525, HD573, PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and PS31G1. 
Bioassay results are shown in Table 8. 

Table 8. Percentage black cutworm mortality associated with B.t. isolates 

                 Estimated toxin concentration (μg toxin/mL diet) 
Sample           200     100     50      25 
PS86BB1          51      25      9       1 
PS31G1           30      20      7       5 
PS11B            37      16      3       0 
HD573            11      13      3       0 
HD129            87      73      43      7 
PS86V1           73      29      19      3 
PS89J3           68      27      15      3 
PS86W1           61      23      12      15 
PS185U2          69      32      14      16 
HD525            67      20      11      4 
water control    (illegible) 
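
A rough median lethal concentration can be read off dose-response data such as Table 8. The following is a minimal sketch using log-linear interpolation between the two doses that bracket 50% mortality. This is a simplified stand-in chosen for illustration: the text does not state how LC50 values were calculated, probit analysis is the usual method, and no correction for control mortality is applied here.

```python
import math


def lc50_by_interpolation(doses, mortalities):
    """Estimate LC50 by interpolating percent mortality against log10(dose).

    Returns None when the data never bracket 50% mortality.
    """
    pairs = sorted(zip(doses, mortalities))
    for (d_low, m_low), (d_high, m_high) in zip(pairs, pairs[1:]):
        if m_low <= 50.0 <= m_high and m_high > m_low:
            fraction = (50.0 - m_low) / (m_high - m_low)
            log_lc50 = math.log10(d_low) + fraction * (math.log10(d_high) - math.log10(d_low))
            return 10.0 ** log_lc50
    return None


doses = [200, 100, 50, 25]                              # ug toxin per mL diet
print(lc50_by_interpolation(doses, [87, 73, 43, 7]))    # HD129 row of Table 8
print(lc50_by_interpolation(doses, [11, 13, 3, 0]))     # HD573 row: never reaches 50%
```

Estimates obtained this way are only as good as the two bracketing points and should not be compared directly with LC50 values reported elsewhere in this document.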

Example 11 - Activity of B.t. Isolates Against Agrotis ipsilon 

Strains were tested as supernatant cultures. Samples were applied to black cutworm 
artificial diet (BioServ, Frenchtown, NJ) and allowed to air dry before larval infestation. A 
water blank containing no B.t. served as the control. Eggs were applied to each treated well, and 
the wells were then sealed with "MYLAR" sheeting (ClearLam Packaging, IL) using a tacking 
iron, and several pinholes were made in each well to provide gas exchange. Bioassays were held 
at 25°C for 7 days in a 14:10 (light:dark) holding room. Mortality was recorded after seven days. 
Strains exhibiting mortality against A. ipsilon (greater than water control) are reported in Table 
9. 



Table 9. Larvicidal activity of B.t. concentrated supernatants in a top load bioassay on 
A. ipsilon neonates 

Strain       Activity 
PS86W1       + 
PS28C        + 
PS86BB1      + 
PS89J3       + 
PS86V1       + 
PS94R1       + 
HD573        + 



Example 12 - Activity of B.t. Isolates and Pseudomonas fluorescens Clones Against Heliothis 
virescens (Fabricius) and Helicoverpa zea (Boddie) 

Strains were tested as either frozen Pseudomonas fluorescens clones or B.t. supernatant 
culture samples. Suspensions of clones were prepared by individually mixing samples with 
distilled water and agitating vigorously. For diet incorporation bioassays, suspensions were 
mixed with the artificial diet at a rate of 6 mL suspension plus 54 mL diet. After vortexing, this 
mixture was poured into plastic trays with compartmentalized 3-ml wells (Nutrend Container 
Corporation, Jacksonville, FL). Supernatant samples were mixed at a rate of 3-6 ml with the 
diet as outlined above. In top load bioassays, suspensions or supernatants were applied to the 
top of the artificial diet and allowed to air dry before larval infestation. A water blank served 
as the control. First instar larvae (USDA-ARS, Stoneville, MS) were placed singly onto the diet 
mixture. Wells were then sealed with "MYLAR" sheeting (ClearLam Packaging) using a 
tacking iron, and several pinholes were made in each well to provide gas exchange. Larvae were 
held at 25°C for 6 days in a 14:10 (light:dark) holding room. Mortality was recorded after six 
days. 

Results are as follows: 



Table 10. Larvicidal activity of B.t. concentrated supernatants in a top load bioassay 

                            H. virescens               H. zea 
Strain     Total Protein    % Mortality    Stunting    % Mortality    Stunting 
           (μg/cm2) 
HD129      44.4             100            yes         50             yes 
           44.4             81             yes         50             yes 
           47.6             100            yes         36             no 
PS185U2    23.4             100            yes         100            yes 
           23.4             100            yes         95             yes 
           21.2             100            yes         96             yes 
           21.2                                        100            yes 
PS31G1     8.3              70             yes         39             yes 
           8.3              17             yes         30             yes 
           3.6              29             yes         30             yes 
           3.6                                         0              no 



Table 11. Strains tested in diet incorporation bioassay on H. virescens and H. zea 

             H. virescens                      H. zea 
Strain       Total protein    % Mortality      Total protein    % Mortality 
             (μg/ml diet)                      (μg/ml diet) 
PS11B        NA*              45               268              96 
PS185U2      55               100              55               100 
PS31G1       0                50               43.4             13 
PS86BB1      23.3             100              23.3             100 
PS86V1       17               100              17               92 
PS86W1       18               100              18               83 
PS89J3       13               100              13               81 
HD129        NA               100              138.3            13 
HD525        3                96               171.7            0 
HD573A       3                96               78.3             21 

*Protein information not available. 
Table 12. H. virescens dose response in diet incorporation bioassays using frozen spore 
crystal preparations 

MR              LC50 (μg/ml) 
1259            13.461 
1259 trypsin    1.974 
1260            12.688 
1260 trypsin    0.260 
1264            95.0 
1264 trypsin    2.823 
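
The effect of trypsin treatment in Table 12 can be expressed as a fold change in LC50. The following is a minimal sketch of that ratio; the dictionary layout and function name are hypothetical, and the values are those of Table 12.

```python
# LC50 in ug/ml from Table 12: (untreated, trypsin-treated) for each clone
LC50_UG_PER_ML = {
    "1259": (13.461, 1.974),
    "1260": (12.688, 0.260),
    "1264": (95.0, 2.823),
}


def fold_reduction(untreated: float, treated: float) -> float:
    """How many times lower the LC50 is after treatment (higher = more potent)."""
    if treated <= 0:
        raise ValueError("treated LC50 must be positive")
    return untreated / treated


for clone, (untreated, treated) in LC50_UG_PER_ML.items():
    print(f"{clone}: {fold_reduction(untreated, treated):.1f}-fold lower LC50 after trypsin")
```

On these figures the trypsin-treated preparations show LC50 values roughly 7-fold to 49-fold lower than the untreated ones.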



Example 13 - Activity Against Ostrinia nubilalis (European Corn Borer) 

Isolates and toxins of the subject invention can be used to control Ostrinia nubilalis, the 
European corn borer (ECB). Activity against ECB can be readily ascertained by, for example, 
standard artificial diet incorporation insect bioassay procedures, using, for example, first instar 
larvae. In a specific embodiment, trypsin-treated clones expressing the 31G1(a) gene were 
found to have an LC50 value of 0.284 μg/ml. 

Example 14 - Insertion of Toxin Genes Into Plants 

One aspect of the subject invention is the transformation of plants with genes encoding 
the insecticidal toxin. The transformed plants are resistant to attack by the target pest. 

Genes encoding pesticidal toxins, as disclosed herein, can be inserted into plant cells 
using a variety of techniques which are well known in the art. For example, a large number of 
cloning vectors comprising a replication system in E. coli and a marker that permits selection 
of the transformed cells are available for preparation for the insertion of foreign genes into 
higher plants. The vectors comprise, for example, pBR322, pUC series, M13mp series, 
pACYC184, etc. Accordingly, the sequence encoding the B.t. toxin can be inserted into the 
vector at a suitable restriction site. The resulting plasmid is used for transformation into E. coli. 
The E. coli cells are cultivated in a suitable nutrient medium, then harvested and lysed. The 
plasmid is recovered. Sequence analysis, restriction analysis, electrophoresis, and other 
biochemical-molecular biological methods are generally carried out as methods of analysis. 
After each manipulation, the DNA sequence used can be cleaved and joined to the next DNA 
sequence. Each plasmid sequence can be cloned in the same or other plasmids. Depending on 
the method of inserting desired genes into the plant, other DNA sequences may be necessary. 
If, for example, the Ti or Ri plasmid is used for the transformation of the plant cell, then at least 
the right border, but often the right and the left border of the Ti or Ri plasmid T-DNA, has to 
be joined as the flanking region of the genes to be inserted. 

The use of T-DNA for the transformation of plant cells has been intensively researched 
and sufficiently described in EP 120 516; Hoekema (1985) In: The Binary Plant Vector System, 
Offset-drukkerij Kanters B.V., Alblasserdam, Chapter 5; Fraley et al., Crit. Rev. Plant Sci. 4:1- 
46; and An et al. (1985) EMBO J. 4:277-287. 

Once the inserted DNA has been integrated in the genome, it is relatively stable there 
and, as a rule, does not come out again. It normally contains a selection marker that confers on 
the transformed plant cells resistance to a biocide or an antibiotic, such as kanamycin, G 418, 
bleomycin, hygromycin, or chloramphenicol, inter alia. The individually employed marker 
should accordingly permit the selection of transformed cells rather than cells that do not contain 
the inserted DNA. 

A large number of techniques are available for inserting DNA into a plant host cell. 
Those techniques include transformation with T-DNA using Agrobacterium tumefaciens or 
Agrobacterium rhizogenes as transformation agent, fusion, injection, biolistics (microparticle 
bombardment), or electroporation as well as other possible methods. If Agrobacteria are used 
for the transformation, the DNA to be inserted has to be cloned into special plasmids, namely 
either into an intermediate vector or into a binary vector. The intermediate vectors can be 
integrated into the Ti or Ri plasmid by homologous recombination owing to sequences that are 
homologous to sequences in the T-DNA. The Ti or Ri plasmid also comprises the vir region 
necessary for the transfer of the T-DNA. Intermediate vectors cannot replicate themselves in 
Agrobacteria. The intermediate vector can be transferred into Agrobacterium tumefaciens by 
means of a helper plasmid (conjugation). Binary vectors can replicate themselves both in E. coli 
and in Agrobacteria. They comprise a selection marker gene and a linker or polylinker which 
are framed by the right and left T-DNA border regions. They can be transformed directly into 
Agrobacteria (Holsters et al. [1978] Mol. Gen. Genet. 163:181-187). The Agrobacterium used 
as host cell is to comprise a plasmid carrying a vir region. The vir region is necessary for the 
transfer of the T-DNA into the plant cell. Additional T-DNA may be contained. The bacterium 
so transformed is used for the transformation of plant cells. Plant explants can advantageously 
be cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes for the transfer of 
the DNA into the plant cell. Whole plants can then be regenerated from the infected plant 
material (for example, pieces of leaf, segments of stalk, roots, but also protoplasts or suspension- 
cultivated cells) in a suitable medium, which may contain antibiotics or biocides for selection. 
The plants so obtained can then be tested for the presence of the inserted DNA. No special 
demands are made of the plasmids in the case of injection and electroporation. It is possible to 
use ordinary plasmids, such as, for example, pUC derivatives. 
The transformed cells grow inside the plants in the usual manner. They can form germ 
cells and transmit the transformed trait(s) to progeny plants. Such plants can be grown in the 
normal manner and crossed with plants that have the same transformed hereditary factors or 
other hereditary factors. The resulting hybrid individuals have the corresponding phenotypic 
properties. 

In a preferred embodiment of the subject invention, plants will be transformed with 
genes wherein the codon usage has been optimized for plants. See, for example, U.S. Patent No. 
5,380,831, which is hereby incorporated by reference. Also, advantageously, plants encoding 
a truncated toxin will be used. The truncated toxin typically will encode about 55% to about 
80% of the full length toxin. Methods for creating synthetic B.t. genes for use in plants are 
known in the art. 

It should be understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview of this 
20 application and die scope of the appended claims. 
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Claims 



1. A method for the control of European corn borer (Ostrinia nubilalis), wherein said 
method comprises contacting said pest with a pesticidal amount of a Bacillus thuringiensis toxin 
wherein said toxin has a characteristic selected from the group consisting of: 

(a) said toxin comprises an amino acid sequence having at least about 75% 
homology with a sequence selected from the group consisting of SEQ ID NO. 
70, SEQ ID NO. 72, SEQ ID NO. 74, SEQ ID NO. 76, SEQ ID NO. 78, SEQ 
ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, 
SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID 
NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104; 

(b) said toxin comprises an amino acid sequence which is encoded by a nucleotide 
which hybridizes with a nucleotide sequence which encodes an amino acid 
sequence selected from the group consisting of SEQ ID NO. 70, SEQ ID NO. 
72, SEQ ID NO. 74, SEQ ID NO. 76, SEQ ID NO. 78, SEQ ID NO. 80, SEQ 
ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, 
SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID 
NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104; and 

(c) said toxin immunoreacts with an antibody to a toxin selected from the group 
consisting of SEQ ID NO. 70, SEQ ID NO. 72, SEQ ID NO. 74, SEQ ID NO. 
76, SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ 
ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, 
SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ 
ID NO. 104. 



2. The method, according to claim 1, wherein said toxin has an amino acid sequence 
shown in SEQ ID NO. 74, or a pesticidal fragment thereof. 
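
The "at least about 75% homology" criterion in claim 1(a) can be illustrated with a simple identity calculation. This is a sketch under stated assumptions, not the method used in this document: it treats homology as ungapped percent identity over two equal-length stretches, whereas a real comparison would normally use an alignment program. The two 16-residue strings are the opening residues of SEQ ID NO: 15 and SEQ ID NO: 41 from the sequence listing, written in one-letter code and chosen only for illustration; they are not among the sequences named in the claims.

```python
def percent_identity(seq_a, seq_b):
    """Ungapped percent identity between two equal-length protein strings."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Illustrative fragments only (one-letter amino acid code).
frag_a = "PGFTGGDILRRTSPGQ"
frag_b = "PGFTGGDILRRTNAGN"

identity = percent_identity(frag_a, frag_b)
print(f"{identity:.2f}% identical; at least 75%: {identity >= 75.0}")
```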
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Schnepf, H. Ernest 
Wicker, Carol 
Narva, Kenneth E. 
Walz, Michelle 
Stockhoff, Brian 
Muller-Cohn, Judy 

(ii) TITLE OF INVENTION: Toxins Active Against Pests 

(iii) NUMBER OF SEQUENCES: 105 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Saliwanchik, Lloyd & Saliwanchik 

(B) STREET: 2421 N.W. 41st Street, Suite A-1 

(C) CITY: Gainesville 

(D) STATE: Florida 

(E) COUNTRY: USA 

(F) ZIP: 32606 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/886,615 

(B) FILING DATE: l-JXni-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/674,002 

(B) FILING DATE :~r- JUL- 1996 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Sanders, Jay M. 

(B) REGISTRATION NUMBER: 39,355 

(C) REFERENCE/DOCKET NUMBER: MA-701C2 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (352) 375-8100 

(B) TELEFAX: (352) 372-5800 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
CGTGGCTATA TCCTTCGTGT YAC 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2 
ACRATRAATG TTCCTTCYGT TTC 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3 
GGATATGTMT TACGTGTAAC WGC 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4 
CTACACTTTC TATRTTGAAT RYACCTTC 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CCAGWTTTAY AGGAGG 16 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GTAAACAAGC TCGCCACCGC 20 
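
Several of the oligonucleotides above contain IUPAC ambiguity codes (Y, R, M, W), so each listed sequence stands for a small pool of concrete sequences. The following minimal sketch enumerates that pool; the function and table names are illustrative and not part of this document, while the code-to-base mapping is the standard IUPAC nucleotide convention.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every concrete sequence represented by a degenerate primer."""
    cleaned = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in cleaned]  # KeyError on unknown symbols
    return ["".join(combo) for combo in product(*choices)]


# SEQ ID NO: 3 as printed in the listing (M = A or C, W = A or T).
for variant in expand_degenerate("GGATATGTMT TACGTGTAAC WGC"):
    print(variant)
```

With two two-fold ambiguous positions, SEQ ID NO: 3 expands to four sequences; SEQ ID NO: 1, with a single Y, expands to two.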

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Xaa Gln 
1 5 10 15 

Ile Ser Xaa Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Xaa Xaa Ala Ser Thr Thr Xaa Xaa Gln Phe His Thr 
35 40 45 

Ser Ile Xaa Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Xaa Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Xaa Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Xaa Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Xaa His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Xaa Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

CCAGGATTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGKSCAGAT TTCAWCCTTA 60 

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCR CWACGCTTCT 120 

ACYACAWATT TWCAATTCCA TACATCAATT GRCGGAAGAC CTATTAATCA GGGKAATTTT 180 

TCASCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA KCTTTAGGAC TGTAGGTTTT 240 

ACTACTCCGT KTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTKC TCATGTCTTC 300 

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360 

GAGGCAGAAT ATGATTTAGA AAGAGCACMA AAGGCGGTGG CGAGCTTGTT TAC 413 
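
The relationship between a degenerate DNA entry and the Xaa positions of the matching protein entry can be sketched as follows. This is an illustrative reconstruction, not a procedure described in this document: it assumes the standard genetic code, reading frame 1, and the rule that a codon is reported as X (Xaa) whenever its ambiguity codes allow more than one amino acid. All names in the block are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna):
    """Translate frame 1; ambiguous codons with more than one outcome give X."""
    seq = dna.replace(" ", "").upper()
    protein = []
    for start in range(0, len(seq) - 2, 3):
        codon = seq[start:start + 3]
        outcomes = {
            CODON_TABLE["".join(combo)]
            for combo in product(*(IUPAC[base] for base in codon))
        }
        protein.append(outcomes.pop() if len(outcomes) == 1 else "X")
    return "".join(protein)


# First 21 bases of SEQ ID NO: 8; AYA can be ACA (Thr) or ATA (Ile), hence X.
print(translate("CCAGGATTTA YAGGAGGAGA T"))  # PGFXGGD
```

The output corresponds to the first seven residues of SEQ ID NO: 7 (Pro Gly Phe Xaa Gly Gly Asp).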



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asp Gly Gly Xaa 
1 5 10 15 

Val Gly Thr Ile Arg Ala Asn Val Asn Ala Pro Leu Thr Gln Gln Tyr 
20 25 30 

Arg Ile Arg Leu Arg Tyr Ala Ser Thr Thr Ser Phe Val Val Asn Leu 
35 40 45 

Phe Val Asn Asn Ser Ala Ala Gly Phe Thr Leu Pro Ser Thr Met Ala 
50 55 60 

Gln Asn Gly Ser Leu Thr Xaa Glu Ser Phe Asn Thr Leu Glu Val Thr 
65 70 75 80 

His Xaa Ile Arg Phe Ser Gln Ser Asp Thr Thr Leu Arg Leu Asn Ile 
85 90 95 

Phe Pro Ser Ile Ser Gly Gln Xaa Val Tyr Val Asp Lys Xaa Glu Ile 
100 105 110 

Val Pro Xaa Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Asp Xaa 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CCAGGWTTTA CAGGAGGGGA TATACTTCGA AGAACGGACG GTGGTRCAGT TGGAACGATT 60 

AGAGCTAATG TTAATGCCCC ATTAACACAA CAATATCGTA TAAGATTACG CTATGCTTCG 120 

ACAACAAGTT TTGTTGTTAA TTTATTTGTT AATAATAGTG CGGCTGGCTT TACTTTACCG 180 

AGTACAATGG CTCAAAATGG TTCTTTAACA YRCGAGTCGT TTAATACCTT AGAGGTAACT 240 

CATWCTATTA GATTTTCACA GTCAGATACT ACACTTAGGT TGAATATATT CCCGTCYATC 300 

TCTGGTCAAG RAGTGTATGT AGATAAACWT GAAATCGTTC CAWTTAACCC GACACGAGAA 360 

GCGGAAGAAG ATTTAGAAGA TSCAAAGAAA GCGGTGGCGA GCTTGTTTAC 410 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 
1 5 10 15 

Phe Gly Thr Ile Arg Val Arg Xaa Thr Ala Pro Leu Thr Gln Arg Tyr 
20 25 30 

Arg Ile Arg Phe Arg Phe Ala Xaa Thr Thr Asn Leu Phe Ile Gly Ile 
35 40 45 

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 
50 55 60 

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 
65 70 75 80 

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 
85 90 95 

Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 
100 105 110 

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 
115 120 125 

Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCAGGTTTTA YAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60 



AGGGTAAGGA YTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTYT 120 
ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180 

GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240 

ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300 

AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360 

GAGGCGAAAG AGGATYTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser Leu 
1 5 10 15 

Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr Arg 
20 25 30 

Ile Xaa Val Arg Tyr Ala Xaa Thr Thr Asn Ile Arg Leu Ser Val Asn 
35 40 45 

Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg Leu 
50 55 60 

Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn Thr 
65 70 75 80 

Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile Glu 
85 90 95 

Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe Ile 
100 105 110 

Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala Lys 
115 120 125 

Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 407 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GGMTTTATAG GAGGAGCTCT ACTTCAAAGG ACTGACCATG GTTCGCTTGG AGTATTGAGG 60 

GTCCAATTTC CACTTCACTT AAGACAACAA TATCGTATTA SAGTCCGTTA TGCTTYTACA 120 

ACAAATATTC GATTGAGTGT GAATGGCAGT TTCGGTACTA TTTCTCAAAA TCTCCCTAGT 180 

ACAATGAGAT TAGGAGAGGA TTTAAGATAC GGATCTTTTG CTATAAGAGA GTTTAATACT 240 

TCTATTAGAC CCACTGCAAG TCCGGACCAA ATTCGATTGA CAATAGAACC ATCTTTTATT 300 

AGACAAGAGG TCTATGTAGA TAGAATTGAG TTCATTCCAG TTAATCCGAC GCGAGAGGCG 360 

AAAGAGGATC TAGAAGCAGC AAAAAAAGCG GTGGCGAGCT TGTTTAC 407 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 
1 5 10 15 

Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Gln Phe His Thr 
35 40 45 

Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Gln Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CCAGGATTTA CAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60 

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTCT 120 

ACCACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGGAATTTT 180 

TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240 

ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300 

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360 

GAGGCAGAAT ATGATTTAGA AAGAGCGCAA AAGGCGGTGG CGAGCTTGTT TAC 413 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Asp Gly Gly Ala 
1 5 10 15 

Val Gly Thr Ile Arg Ala Asn Val Asn Ala Pro Leu Thr Gln Gln Tyr 
20 25 30 

Arg Ile Arg Leu Arg Tyr Ala Ser Thr Thr Ser Phe Val Val Asn Leu 
35 40 45 

Phe Val Asn Asn Ser Ala Ala Gly Phe Thr Leu Pro Ser Thr Met Ala 
50 55 60 

Gln Asn Gly Ser Leu Thr Tyr Glu Ser Phe Asn Thr Leu Glu Val Thr 
65 70 75 80 

His Thr Ile Arg Phe Ser Gln Ser Asp Thr Thr Leu Arg Leu Asn Ile 
85 90 95 

Phe Pro Ser Ile Ser Gly Gln Glu Val Tyr Val Asp Lys Leu Glu Ile 
100 105 110 

Val Pro Ile Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Asp Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CCAGGWTTTA YAGGAGGGGA TATACTTCGA AGAACGGACG GTGGTGCAGT TGGAACGATT 60 

AGAGCTAATG TTAATGCCCC ATTAACACAA CAATATCGTA TAAGATTACG CTATGCTTCG 120 

ACAACAAGTT TTGTTGTTAA TTTATTTGTT AATAATAGTG CGGCTGGCTT TACTTTACCG 180 

AGTACAATGG CTCAAAATGG TTCTTTAACA TACGAGTCGT TTAATACCTT AGAGGTAACT 240 

CATACTATTA GATTTTCACA GTCAGATACT ACACTTAGGT TGAATATATT CCCGTCTATC 300 

TCTGGTCAAG AAGTGTATGT AGATAAACTT GAAATCGTTC CAATTAACCC GACACGAGAA 360 

GCGGAAGAAG ATTTAGAAGA TGCAAAGAAA GCGGTGGCGA GCTTGTTTAC 410 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 
1 5 10 15 

Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Arg Tyr Ala Xaa Thr Thr Asn Leu Gln Phe His Thr 
35 40 45 

Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Gln Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CCAGGWTTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60 

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTYT 120 

ACYACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGKAATTTT 180 

TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240 

ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300 

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360 

GAGGCAGAAT ATGATTTAGA AAGAGCACAA AAGGCGGTGG CGAGCTTGTT TAC 413 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Phe Thr Gly Gly Asp Ile Leu Arg Arg Asn Thr Ile Gly Glu Phe Val 
1 5 10 15 

Ser Leu Gln Val Asn Ile Asn Ser Pro Ile Thr Gln Arg Tyr Arg Leu 
20 25 30 

Arg Phe Arg Tyr Ala Ser Ser Arg Asp Ala Arg Ile Thr Val Ala Ile 
35 40 45 

Gly Gly Gln Ile Arg Val Asp Met Thr Leu Glu Lys Thr Met Glu Ile 
50 55 60 

Gly Glu Ser Leu Thr Xaa Arg Thr Phe Ser Tyr Thr Asn Phe Ser Asn 
65 70 75 80 

Pro Phe Ser Phe Arg Ala Asn Pro Asp Ile Ile Arg Ile Ala Glu Glu 
85 90 95 

Leu Pro Ile Arg Gly Gly Glu Leu Val Tyr 
100 105 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 318 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

TTTACAGGAG GGGATATCCT TCGAAGAAAT ACCATTGGTG AGTTTGTGTC TTTACAAGTC 60 

AATATTAACT CACCAATTAC CCAAAGATAC CGTTTAAGAT TTCGTTATGC TTCCAGTAGG 120 

GATGCACGAA TTACTGTAGC GATAGGAGGA CAAATTAGAG TAGATATGAC CCTTGAAAAA 180 

ACCATGGAAA TTGGGGAGAG CTTAACATYT AGAACATTTA GCTATACCAA TTTTAGTAAT 240 
CCTTTTTCAT TTAGGGCTAA TCCAGATATA ATTAGAATAG CTGAAGAACT TCCTATTCGC 300 

GGTGGCGAGC TTGTTTAC 318 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Ile Pro Leu Val Ser Leu Cys Leu Tyr Lys Ser Ile Leu Thr His Gln 
1 5 10 15 

Leu Pro Lys Asp Thr Val Xaa Xaa Phe Val Met Leu Pro Val Gly Met 
20 25 30 

His Glu Leu Leu Xaa Arg Xaa Glu Asp Lys Leu Glu Xaa Ile Xaa Pro 
35 40 45 

Leu Lys Lys Pro Trp Lys Leu Gly Arg Ala Xaa His Leu Glu His Leu 
50 55 60 

Ala Ile Pro Ile Leu Val Ile Leu Phe His Leu Gly Leu Ile Gln Ile 
65 70 75 80 

Xaa Leu Glu Xaa Leu Lys Asn Phe Leu Phe Ala Val Ala Ser Leu Phe 
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 292 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AAATACCATT GGTGAGTTTG TGTCTTTACA AGTCAATATT AACTCACCAA TTACCCAAAG 60 

ATACCGTTTA AGATTTCGTT ATGCTTCCAG TAGGGATGCA CGAATTACTG TAGCGATAGG 120 

AGGACAAATT AGAGTAGATA TGACCCTTGA AAAAACCATG GAAATTGGGG AGAGCTTAAC 180 

ATCTAGAACA TTTAGCTATA CCAATTTTAG TAATCCTTTT TCATTTAGGG CTAATCCAGA 240 

TATAATTAGA ATAGCTGAAG AACTTCCTAT TCGCGGTGGC GAGCTTGTTT AC 292 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Asn Thr Ile Gly Glu 
1 5 10 15 

Phe Val Ser Leu Gln Val Asn Ile Asn Ser Pro Ile Thr Gln Arg Tyr 
20 25 30 

Arg Leu Arg Phe Arg Tyr Ala Ser Ser Arg Asp Ala Arg Ile Thr Val 
35 40 45 

Ala Ile Gly Gly Gln Ile Arg Val Xaa Met Thr Leu Glu Lys Thr Met 
50 55 60 

Glu Ile Gly Glu Ser Leu Thr Ser Arg Thr Phe Ser Tyr Thr Asn Phe 
65 70 75 80 

Ser Asn Pro Phe Ser Phe Arg Ala Asn Pro Asp Ile Ile Arg Ile Ala 
85 90 95 

Glu Glu Leu Pro Ile Arg Gly Gly Glu Leu Val Tyr 
100 105 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

CCAGGWTTTA YAGGAGGGGA TATCCTTCGA AGAAATACCA TTGGTGAGTT TGTGTCTTTA 60 

CAAGTCAATA TTAACTCACC AATTACCCAA AGATACCGTT TAAGATTTCG TTATGCTTCC 120 

AGTAGGGATG CACGAATTAC TGTAGCGATA GGAGGACAAA TTAGAGTAKA TATGACCCTT 180 

GAAAAAACCA TGGAAATTGG GGAGAGCTTA ACATCTAGAA CATTTAGCTA TACCAATTTT 240 

AGTAATCCTT TTTCATTTAG GGCTAATCCA GATATAATTA GAATAGCTGA AGAACTTCCT 300 

ATTCGCGGTG GCGAGCTTGT TTAC 324 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Gly Phe Xaa Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly Phe 
1 5 10 15 

Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr Arg 
20 25 30 

Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val Thr 
35 40 45 

Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met Asn 
50 55 60 

Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe Thr 
65 70 75 80 

Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Xaa Ile 
85 90 95 

Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu Ile 
100 105 110 

Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 411 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

AGGATTTAYA GGAGGAGATG TAATCCGAAG AACAAATACT GGTGGATTCG GAGCAATAAG 60 

GGTGTCGGTC ACTGGACCGC TAACACAACG ATATCGCATA AGGTTCCGTT ATGCTTCGAC 120 

AATAGATTTT GATTTCTTTG TAACACGTGG AGGAACTACT ATAAATAATT TTAGATTTAC 180 

ACGTACAATG AACAGGGGAC AGGAATCAAG ATATGAATCC TATCGTACTG TAGAGTTTAC 240 

AACTCCTTTT AACTTTACAC AAAGTCAAGA TATAATTCGA ACAYCTATCC AGGGACTTAG 300 

TGGAAATGGG GAAGTATACC TTGATAGAAT TGAAATCATC CCTGTAAATC CAACACGAGA 360 

AGCGGAAGAR GATTTAGAAG CGGCGAAGAA AGCGGTGGCG AGCTTGTTTA C 411 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 
1 5 10 15 

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 
20 25 30 

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 
35 40 45 

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 
50 55 60 

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 
65 70 75 80 

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 
85 90 95 

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 
100 105 110 

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60 

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120 

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180 

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240 

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300 

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360 

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Pro Gly Phe Xaa Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr 
1 5 10 15 

Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr 
20 25 30 

Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile 
35 40 45 

Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met 
50 55 60 

Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe 
65 70 75 80 

Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu 
85 90 95 

Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe 
100 105 110 

Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu 
115 120 125 

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 140 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

CCAGGWTTTA YAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60 

AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120 

TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180 

GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240 

ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300 

CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360 

GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCRG CGAAGAAAGC GGTGGCGAGC 420 

TTGTTTAC 428 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



PCT/US98/26585 

WO 99/33991 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 
1 5 10 15 

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 
20 25 30 

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 
35 40 45 

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 
50 55 60 

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 
65 70 75 80 

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 
85 90 95 

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 
100 105 110 

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 
CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60 

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120 

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180 

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240 

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300 

ATTAGACRAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360 

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410 



(2) INFORMATION FOR SEQ ID NO: 35: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 
1 5 10 15 

Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr 
20 25 30 

Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile 
35 40 45 

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 
50 55 60 

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 
65 70 75 80 

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 
85 90 95 

Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 
100 105 110 

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 
115 120 125 

Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60 

AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120 

ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180 

GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240 

ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300 

AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360 

GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 
1 5 10 15 

Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Gln Phe His Thr 
35 40 45 

Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Gln Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 38: 



PCT/US98/26585 

WO 99/33991 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
CCAGGWTTTA CAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60 

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTCT 120 

ACCACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGGAATTTT 180 

TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240 

ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300 

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360 

GAGGCAGAAT ATGATTTAGA AAGAGCACAR AAGGCGGTGG CGAGCTTGTT TAC 413 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 
1 5 10 15 

Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr 
20 25 30 

Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile 
35 40 45 

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 
50 55 60 

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 
65 70 75 80 

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 
85 90 95 

Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 
100 105 110 

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 
115 120 125 

Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60
AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120
ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180
GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240
ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300
AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360
GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asn Ala Gly Asn
1 5 10 15

Phe Gly Asp Met Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr
20 25 30

Arg Val Arg Ile Arg Tyr Ala Ser Thr Ala Asn Leu Gln Phe His Thr
35 40 45

Ser Ile Asn Gly Arg Ala Ile Asn Gln Ala Asn Phe Pro Ala Thr Met
50 55 60

Asn Ser Gly Glu Asn Leu Gln Ser Gly Ser Phe Arg Val Ala Gly Phe
65 70 75 80

Thr Thr Pro Phe Thr Phe Ser Asp Ala Leu Ser Thr Phe Thr Ile Gly
85 90 95

Ala Phe Ser Phe Ser Ser Asn Asn Glu Val Tyr Ile Asp Arg Ile Glu
100 105 110

Phe Val Pro Ala Glu Val Thr Phe Ala Thr Glu Ser Asp Gln Asp Arg
115 120 125

Ala Gln Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

CCAGGWTTTA CAGGAGGGGA TATCCTTCGA AGAACGAATG CTGGTAACTT TGGAGATATG 60

CGTGTAAACA TTACTGCACC ACTATCACAA AGATATCGCG TAAGGATTCG TTATGCTTCT 120 

ACTGCAAATT TACAATTCCA TACATCAATT AACGGAAGAG CCATTAATCA GGCGAATTTC 180 

CCAGCAACTA TGAACAGTGG GGAGAATTTA CAGTCCGGAA GCTTCAGGGT TGCAGGTTTT 240 

ACTACTCCAT TTACCTTTTC AGATGCACTA AGCACATTCA CAATAGGTGC TTTTAGCTTC 300 

TCTTCAAACA ACGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACATTT 360 

GCAACAGAAT CTGATCAGGA TAGAGCACAA AAGGCGGTGG CGAGCTTGTT TAG 413 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Xaa Ala Ala
115 120 125

Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60 

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360

GCGAAAGAGG ATCTAKAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Gln Xaa Leu Ser Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly
1 5 10 15

Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val
35 40 45

Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met
50 55 60

Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe
65 70 75 80

Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser
85 90 95

Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CCAGGWTTTA TCAGGAGGAG ATGTAATCCG AAGAACAAAT ACTGGTGGAT TCGGAGCAAT 60

AAGGGTGTCG GTCACTGGAC CGCTAACACA ACGATATCGC ATAAGGTTCC GTTATGCTTC 120

GACAATAGAT TTTGATTTCT TTGTAACACG TGGAGGAACT ACTATAAATA ATTTTAGATT 180

TACACGTACA ATGAACAGGG GACAGGAATC AAGATATGAA TCCTATCGTA CTGTAGAGTT 240

TACAACTCCT TTTAACTTTA CACAAAGTCA AGATATAATT CGAACATCTA TCCAGGGACT 300

TAGTGGAAAT GGGGAAGTAT ACCTTGATAG AATTGAAATC ATCCCTGTAA ATCCAACACG 360

AGAAGCGGAA GARGATTTAG AAGCGGCGAA GAAAGCGGTG GCGAGCTTGT TTAC 414

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Pro Gly Phe Thr Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr
1 5 10 15

Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile
35 40 45

Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met
50 55 60

Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe
65 70 75 80

Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu
85 90 95

Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe
100 105 110

Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu
115 120 125

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135 140

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

CCAGGWTTTA CAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60 

AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120 

TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180 

GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240 

ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300 

CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360

GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCAG CGAAGAAAGC GGTGGCGAGC 420 

TTGTTTAC 428 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala
115 120 125

Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

CCAGGWTTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60 

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120 

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180 

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240 

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300 

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360 

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr
1 5 10 15

Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile
35 40 45

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met
50 55 60

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe
65 70 75 80

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe
85 90 95

Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 412 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60 

AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120 

ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180 

GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240 

ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300 

AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360 

GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TA 412

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Pro Gly Phe Thr Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly
1 5 10 15

Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val
35 40 45

Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met
50 55 60

Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe
65 70 75 80

Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser
85 90 95

Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Xaa Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAGGATTTA CAGGAGGAGA TGTAATCCGA AGAACAAATA CTGGTGGATT CGGAGCAATA 60

AGGGTGTCGG TCACTGGACC GCTAACACAA CGATATCGCA TAAGGTTCCG TTATGCTTCG 120

ACAATAGATT TTGATTTCTT TGTAACACGT GGAGGAACTA CTATAAATAA TTTTAGATTT 180

ACACGTACAA TGAACAGGGG ACAGGAATCA AGATATGAAT CCTATCGTAC TGTAGAGTTT 240

ACAACTCCTT TTAACTTTAC ACAAAGTCAA GATATAATTC GAACATCTAT CCAGGGACTT 300

AGTGGAAATG GGGAAGTATA CCTTGATAGA ATTGAAATCA TCCCTGTAAA TCCAACACGA 360

GAAGCGGAAG AGGATTTWGA AGCGGCGAAG AAAGCGGTGG CGAGCTTGTT TAG 413
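
As a reading aid only, and not part of the original listing, the correspondence between a nucleotide listing and the amino acid listing that precedes it can be checked by translating the DNA in frame. The sketch below assumes the standard genetic code and that the reading frame starts at the first base. The function name `translate`, the use of `Xaa` for any codon containing an ambiguity code, and the use of `Ter` for stop codons are illustrative assumptions. The example input is the first 60 bases of SEQ ID NO: 54.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate DNA (spaces allowed) from its first base, codon by codon.

    Codons containing anything other than A, C, G, T (for example the
    ambiguity codes W, R, K) are reported as "Xaa". A trailing partial
    codon is ignored.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        one_letter = CODON_TABLE.get(seq[i:i + 3])
        residues.append(THREE_LETTER[one_letter] if one_letter else "Xaa")
    return residues


first_row = ("CCAGGATTTA CAGGAGGAGA TGTAATCCGA "
             "AGAACAAATA CTGGTGGATT CGGAGCAATA")
print(" ".join(translate(first_row)))
```

Reporting every ambiguous codon as `Xaa` is a simplification: some ambiguity codes (for example `GGW`, where both `GGA` and `GGT` encode Gly) still determine a single residue.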

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Xaa Asp Leu Xaa Ala Ala
115 120 125

Lys Lys Ala Val Ala Ser Leu Phe
130 135

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360

GCGAAAGAKG ATCTABAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Pro Gly Phe Thr Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly
1 5 10 15

Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val
35 40 45

Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met
50 55 60

Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe
65 70 75 80

Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser
85 90 95

Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
CCAGGWTTTA CAGGAGGAGA TGTAATCCGA AGAACAAATA CTGGTGGATT CGGAGCAATA 60
AGGGTGTCGG TCACTGGACC GCTAACACAA CGATATCGCA TAAGGTTCCG TTATGCTTCG 120
ACAATAGATT TTGATTTCTT TGTAACACGT GGAGGAACTA CTATAAATAA TTTTAGATTT 180
ACACGTACAA TGAACAGGGG ACAGGAATCA AGATATGAAT CCTATCGTAC TGTAGAGTTT 240
ACAACTCCTT TTAACTTTAC ACAAAGTCAA GATATAATTC GAACATCTAT CCAGGGACTT 300
AGTGGAAATG GGGAAGTATA CCTTGATAGA ATTGAAATCA TCCCTGTAAA TCCAACACGA 360
GAAGCGGAAG AGGATTTAGA AGCGGCGAAG AAAGCGGTGG CGAGCTTGTT TAC 413



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Pro Gly Phe Xaa Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr
1 5 10 15

Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile
35 40 45

Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met
50 55 60

Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe
65 70 75 80

Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu
85 90 95

Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe
100 105 110

Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu
115 120 125

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135 140



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

CCAGGWTTTA YAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60 

AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120

TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180

GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240

ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300

CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360

GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCAG CGAAGAAAGC GGTGGCGAGC 420

TTGTTTAC 428



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 




(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala
115 120 125

Lys Lys Ala Val Ala Ser Leu Phe
130 135

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

CCAGGTTTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60 

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120 

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240 

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300 

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410



(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
GTTCATTGGT ATAAGAGTTG GTG 



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
CCACTGCAAG TCCGGACCAA ATTCG 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
GAATATATTC CCGTCYATCT CTGG 
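
Several sequences in this listing contain IUPAC nucleotide ambiguity codes (for example W, R, K and Y), each of which stands for a set of possible bases. As an illustration only, and not part of the original listing, the sketch below enumerates the concrete sequences represented by such an entry, using SEQ ID NO: 65 as input. The function name `expand` and the dictionary name `IUPAC` are hypothetical; the code-to-base mapping is the standard IUPAC one.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(seq):
    """Return every unambiguous sequence matched by an IUPAC-coded one.

    Spaces in the input (the 10-base grouping used in the listing) are
    ignored. Unknown characters raise KeyError rather than being guessed.
    """
    cleaned = "".join(seq.split()).upper()
    choices = [IUPAC[base] for base in cleaned]
    return ["".join(combo) for combo in product(*choices)]


for variant in expand("GAATATATTC CCGTCYATCT CTGG"):
    print(variant)
```

Because the number of variants is the product of the set sizes at each position, a sequence with a single Y expands to two variants, while heavily degenerate sequences can expand to many.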



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GCACGAATTA CTGTAGCGAT AGG 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

GCTGGTAACT TTGGAGATAT GCGTG



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
GATTTCTTTG TAACACGTGG AGG 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CACTACTAAT CAGAGCGATC TG 



(2) INFORMATION FOR SEQ ID NO: 70: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1156 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Met Asn Gln Asn Lys His Gly Ile Ile Gly Ala Ser Asn Cys Gly Cys
1 5 10 15

Ala Ser Asp Asp Val Ala Lys Tyr Pro Leu Ala Asn Asn Pro Tyr Ser
20 25 30

Ser Ala Leu Asn Leu Asn Ser Cys Gln Asn Ser Ser Ile Leu Asn Trp
35 40 45

Ile Asn Ile Ile Gly Asp Ala Ala Lys Glu Ala Val Ser Ile Gly Thr
50 55 60

Thr Ile Val Ser Leu Ile Thr Ala Pro Ser Leu Thr Gly Leu Ile Ser
65 70 75 80

Ile Val Tyr Asp Leu Ile Gly Lys Val Leu Gly Gly Ser Ser Gly Gln
85 90 95

Ser Ile Ser Asp Leu Ser Ile Cys Asp Leu Leu Ser Ile Ile Asp Leu
100 105 110

Arg Val Ser Gln Ser Val Leu Asn Asp Gly Ile Ala Asp Phe Asn Gly
115 120 125

Ser Val Leu Leu Tyr Arg Asn Tyr Leu Glu Ala Leu Asp Ser Trp Asn
130 135 140

Lys Asn Pro Asn Ser Ala Ser Ala Glu Glu Leu Arg Thr Arg Phe Arg
145 150 155 160

Ile Ala Asp Ser Glu Phe Asp Arg Ile Leu Thr Arg Gly Ser Leu Thr
165 170 175

Asn Gly Gly Ser Leu Ala Arg Gln Asn Ala Gln Ile Leu Leu Leu Pro
180 185 190

Ser Phe Ala Ser Ala Ala Phe Phe His Leu Leu Leu Leu Arg Asp Ala
195 200 205

Thr Arg Tyr Gly Thr Asn Trp Gly Leu Tyr Asn Ala Thr Pro Phe Ile
210 215 220

Asn Tyr Gln Ser Lys Leu Val Glu Leu Ile Glu Leu Tyr Thr Asp Tyr
225 230 235 240

Cys Val His Trp Tyr Asn Arg Gly Phe Asn Glu Leu Arg Gln Arg Gly
245 250 255

Thr Ser Ala Thr Ala Trp Leu Glu Phe His Arg Tyr Arg Arg Glu Met
260 265 270

Thr Leu Met Val Leu Asp Ile Val Ala Ser Phe Ser Ser Leu Asp Ile
275 280 285

Thr Asn Tyr Pro Ile Glu Thr Asp Phe Gln Leu Ser Arg Val Ile Tyr
290 295 300

Thr Asp Pro Ile Gly Phe Val His Arg Ser Ser Leu Arg Gly Glu Ser
305 310 315 320

Trp Phe Ser Phe Val Asn Arg Ala Asn Phe Ser Asp Leu Glu Asn Ala
325 330 335

Ile Pro Asn Pro Arg Pro Ser Trp Phe Leu Asn Asn Met Ile Ile Ser
340 345 350

Thr Gly Ser Leu Thr Leu Pro Val Ser Pro Ser Thr Asp Arg Ala Arg
355 360 365

Val Trp Tyr Gly Ser Arg Asp Arg Ile Ser Pro Ala Asn Ser Gln Phe
370 375 380

Ile Thr Glu Leu Ile Ser Gly Gln His Thr Thr Ala Thr Gln Thr Ile
385 390 395 400

Leu Gly Arg Asn Ile Phe Arg Val Asp Ser Gln Ala Cys Asn Leu Asn
405 410 415

Asp Thr Thr Tyr Gly Val Asn Arg Ala Val Phe Tyr His Asp Ala Ser
420 425 430

Glu Gly Ser Gln Arg Ser Val Tyr Glu Gly Tyr Ile Arg Thr Thr Gly
435 440 445

Ile Asp Asn Pro Arg Val Gln Asn Ile Asn Thr Tyr Leu Pro Gly Glu
450 455 460

Asn Ser Asp Ile Pro Thr Pro Glu Asp Tyr Thr His Ile Leu Ser Thr
465 470 475 480

Thr Ile Asn Leu Thr Gly Gly Leu Arg Gln Val Ala Ser Asn Arg Arg
485 490 495

Ser Ser Leu Val Met Tyr Gly Trp Thr His Lys Ser Leu Ala Arg Asn
500 505 510



Asn Thr Ile Asn Pro Asp Arg Ile Thr Gln Ile Pro Leu Thr Lys Val
515 520 525

Asp Thr Arg Gly Thr Gly Val Ser Tyr Val Asn Asp Pro Gly Phe Ile
530 535 540

Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser Leu Gly Val Leu
545 550 555 560

Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr Arg Ile Arg Val
565 570 575

Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val Asn Gly Ser Phe
580 585 590

Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg Leu Gly Glu Asp
595 600 605

Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn Thr Ser Ile Arg
610 615 620

Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile Glu Pro Ser Phe
625 630 635 640

Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe Ile Pro Val Asn
645 650 655

Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala Lys Lys Ala Val
660 665 670

Ala Ser Leu Phe Thr Arg Thr Arg Asp Gly Leu Gln Val Asn Val Lys
675 680 685

Asp Tyr Gln Val Asp Gln Ala Ala Asn Leu Val Ser Cys Leu Ser Asp
690 695 700

Glu Gln Tyr Gly Tyr Asp Lys Lys Met Leu Leu Glu Ala Val Arg Ala
705 710 715 720

Ala Lys Arg Leu Ser Arg Glu Arg Asn Leu Leu Gln Asp Pro Asp Phe
725 730 735

Asn Thr Ile Asn Ser Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly
740 745 750

Val Thr Ile Ser Glu Gly Gly Pro Phe Tyr Lys Gly Arg Ala Ile Gln
755 760 765

Leu Ala Ser Ala Arg Glu Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys Val
770 775 780

Asp Ala Ser Glu Leu Lys Pro Tyr Thr Arg Tyr Arg Leu Asp Gly Phe
785 790 795 800

Val Lys Ser Ser Gln Asp Leu Glu Ile Asp Leu Ile His His His Lys
805 810 815

Val His Leu Val Lys Asn Val Pro Asp Asn Leu Val Ser Asp Thr Tyr
820 825 830

Pro Asp Asp Ser Cys Ser Gly Ile Asn Arg Cys Gln Glu Gln Gln Met
835 840 845

Val Asn Ala Gln Leu Glu Thr Glu His His His Pro Met Asp Cys Cys
850 855 860

Glu Ala Ala Gln Thr His Glu Phe Ser Ser Tyr Ile Asp Thr Gly Asp
865 870 875 880

Leu Asn Ser Ser Val Asp Gln Gly Ile Trp Ala Ile Phe Lys Val Arg
885 890 895

Thr Thr Asp Gly Tyr Ala Thr Leu Gly Asn Leu Glu Leu Val Glu Val
900 905 910

Gly Pro Leu Ser Gly Glu Ser Leu Glu Arg Glu Gln Arg Asp Asn Thr
915 920 925

Lys Trp Ser Ala Glu Leu Gly Arg Lys Arg Ala Glu Thr Asp Arg Val
930 935 940

Tyr Gln Asp Ala Lys Gln Ser Ile Asn His Leu Phe Val Asp Tyr Gln
945 950 955 960

Asp Gln Gln Leu Asn Pro Glu Ile Gly Met Ala Asp Ile Met Asp Ala
965 970 975

Gln Asn Leu Val Ala Ser Ile Ser Asp Val Tyr Ser Asp Ala Val Leu
980 985 990

Gln Ile Pro Gly Ile Asn Tyr Glu Ile Tyr Thr Glu Leu Ser Asn Arg
995 1000 1005

Leu Gln Gln Ala Ser Tyr Leu Tyr Thr Ser Arg Asn Ala Val Gln Asn
1010 1015 1020

Gly Asp Phe Asn Asn Gly Leu Asp Ser Trp Asn Ala Thr Ala Gly Ala
1025 1030 1035 1040

Ser Val Gln Gln Asp Gly Asn Thr His Phe Leu Val Leu Ser His Trp
1045 1050 1055

Asp Ala Gln Val Ser Gln Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr
1060 1065 1070

Val Leu Arg Val Thr Ala Glu Lys Val Gly Gly Gly Asp Gly Tyr Val
1075 1080 1085

Thr Ile Arg Asp Asp Ala His His Thr Glu Thr Leu Thr Phe Asn Ala
1090 1095 1100

Cys Asp Tyr Asp Ile Asn Gly Thr Tyr Val Thr Asp Asn Thr Tyr Leu
1105 1110 1115 1120

Thr Lys Glu Val Val Phe His Pro Glu Thr Gln His Met Trp Val Glu
1125 1130 1135

Val Asn Glu Thr Glu Gly Ala Phe His Ile Asp Ser Ile Glu Phe Val
1140 1145 1150

Glu Thr Glu Lys
1155

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3471 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
ATGAATCAAA ATAAACACGG AATTATTGGC GCTTCCAATT GTGGTTGTGC ATCTGATGAT 60
GTTGCGAAAT ATCCTTTAGC CAACAATCCA TATTCATCTG CTTTAAATTT AAATTCTTGT 120
CAAAATAGTA GTATTCTCAA CTGGATTAAC ATAATAGGCG ATGCAGCAAA AGAAGCAGTA 180
TCTATTGGGA CAACCATAGT CTCTCTTATC ACAGCACCTT CTCTTACTGG ATTAATTTCA 240
ATAGTATATG ACCTTATAGG TAAAGTACTA GGAGGTAGTA GTGGACAATC CATATCAGAT 300
TTGTCTATAT GTCACTTATT ATCTATTATT GATTTACGGG TAAGTCAGAG TGTTTTAAAT 360
GATCGGATTG CAGATTTTAA TGGTTCTCTA CTCTTATACA GGAACTATTT AGAGGCTCTG 420
GATAGCTGGA ATAAGAATCC TAATTCTGCT TCTGCTGAAG AACTCCGTAC TCGTTTTAGA 480
ATCGCCGACT CAGAATTTGA TAGAATTTTA ACCCGAGGGT CTTTAACGAA TGGTGGCTCG 540
TTAGCTAGAC AAAATGCCCA AATATTATTA TTACCTTCTT TTGCGAGCGC TGCATTTTTC 600
CATTTATTAC TACTAAGGGA TGCTACTAGA TATGGCACTA ATTGGGGGCT ATACAATGCT 660
ACACCTTTTA TAAATTATCA ATCAAAACTA GTAGAGCTTA TTGAACTATA TACTGATTAT 720
TGCGTACATT GGTATAATCG AGGTTTCAAC GAACTAAGAC AACGAGGCAC TAGTGCTACA 780
GCTTGGTTAG AATTTCATAG ATATCGTAGA GAGATGACAT TGATGGTATT AGATATAGTA 840
GCATCATTTT CAAGTCTTGA TATTACTAAT TACCCAATAG AAACAGATTT TCAGTTGAGT 900
AGGGTCATTT ATACAGATCC AATTGGTTTT GTACATCGTA GTAGTCTTAG GGGAGAAAGT 960
TGGTTTAGCT TTGTTAATAG AGCTAATTTC TCAGATTTAG AAAATGCAAT ACCTAATCCT 1020
AGACCGTCTT GGTTTTTAAA TAATATGATT ATATCTACTG GTTCACTTAC ATTGCCGGTT 1080
AGCCCAAGTA CTGATAGAGC GAGGGTATGG TATGGAAGTC GAGATCGAAT TTCCCCTGCT 1140
AATTCACAAT TTATTACTGA ACTAATCTCT GGACAACATA CGACTGCTAC ACAAACTATT 1200
TTAGGGCGAA ATATATTTAG AGTAGATTCT CAAGCTTGTA ATTTAAATGA TACCACATAT 1260
GGAGTGAATA GGGCGGTATT TTATCATGAT GCGAGTGAAG GTTCTCAAAG ATCCGTGTAC 1320
GAGGGGTATA TTCGAACAAC TGGGATAGAT AACCCTAGAG TTCAAAATAT TAACACTTAT 1380
TTACCTGGAG AAAATTCAGA TATCCCAACT CCAGAAGACT ATACTCATAT ATTAAGCACA 1440
ACAATAAATT TAACAGGAGG ACTTAGACAA GTAGCATCTA ATCGCCGTTC ATCTTTAGTA 1500
ATGTATGGTT GGACACATAA AAGTCTGGCT CGTAACAATA CCATTAATCC AGATAGAATT 1560
ACACAGATAC CATTGACGAA GGTTGATACC CGAGGCACAG GTGTTTCTTA TGTGAATGAT 1620
CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 1680
AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 1740
ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 1800
AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 1860
ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 1920
ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 1980
GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC ACGCACAAGG 2040
GACGGATTAC AAGTAAATGT GAAAGATTAT CAAGTCGATC AAGCGGCAAA TTTAGTGTCA 2100
TGCTTATCAG ATGAACAATA TGGGTATGAC AAAAAGATGT TATTGGAAGC GGTACGTGCG 2160
GCAAAACGAC TTAGCCGAGA ACGCAACTTA CTTCAGGATC CAGATTTTAA TACAATCAAT 2220
AGTACAGAAG AAAATGGATG GAAAGCAAGT AACGGCGTTA CTATTAGTGA GGGCGGGCCA 2280
TTCTATAAAG GCCGTGCAAT TCAGCTAGCA AGTGCACGAG AAAATTACCC AACATACATC 2340
TATCAAAAAG TAGATGCATC GGAGTTAAAG CCGTATACAC GTTATAGACT GGATGGGTTC 2400
GTGAAGAGTA GTCAAGATTT AGAAATTGAT CTCATTCACC ATCATAAAGT CCATCTTGTG 2460
AAAAATGTAC CAGATAATTT AGTATCTGAT ACTTACCCAG ATGATTCTTG TAGTGGAATC 2520
AATCGATGTC AGGAACAACA GATGGTAAAT GCGCAACTGG AAACAGAGCA TCATCATCCG 2580
ATGGATTGCT GTGAAGCAGC TCAAACACAT GAGTTTTCTT CCTATATTGA TACAGGGGAT 2640
TTAAATTCGA GTGTAGACCA GGGAATCTGG GCGATCTTTA AAGTTCGAAC AACCGATGGT 2700
TATGCGACGT TAGGAAATCT TGAATTGGTA GAGGTCGGAC CGTTATCGGG TGAATCTTTA 2760
GAACGTGAAC AAAGGGATAA TACAAAATGG AGTGCAGAGC TAGGAAGAAA GCGTGCAGAA 2820
ACAGATCGCG TGTATCAAGA TCCCAAACAA TCCATCAATC ATTTATTTGT GGATTATCAA 2880
GATCAACAAT TAAATCCAGA AATAGGGATG GCAGATATTA TGGACGCTCA AAATCTTGTC 2940
GCATCAATTT CAGATGTATA TAGCGATGCC GTACTGCAAA TCCCTGGAAT TAACTATGAG 3000
ATTTACACAG AGCTGTCCAA TCGCTTACAA CAAGCATCGT ATCTGTATAC GTCTCGAAAT 3060
GCGGTGCAAA ATGGGGACTT TAACAACGGG CTAGATAGCT GGAATGCAAC AGCGGGTGCA 3120
TCGGTACAAC AGGATGGCAA TACGCATTTC TTAGTTCTTT CTCATTGGGA TGCACAAGTT 3180
TCTCAACAAT TTAGAGTGCA GCCGAATTGT AAATATGTAT TACGTGTAAC AGCAGAGAAA 3240
GTAGGCGGCG GAGACGGATA CGTGACTATC CGGGATGATG CTCATCATAC AGAAACGCTT 3300
ACATTTAATG CATGTGATTA TGATATAAAT GGCACGTACG TGACTGATAA TACGTATCTA 3360
ACAAAAGAAG TGGTATTCCA TCCGGAGACA CAACACATGT GGGTAGAGGT AAATGAAACA 3420
GAAGGTGCAT TTCATATAGA TAGTATTGAA TTCGTTGAAA CAGAAAAGTA A 3471

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1156 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Met Asn Arg Asn Asn Gin Asn Qlu Tyr Glu He He Asp Ala Pro His 
1.5 10 15 

Cys Gly Cys Pro Ser Asp Asp Asp Val Arg Tyr Pro Leu Ala Ser Asp 
20 25 30 

Pro Asn Ala Ala Leu Gin Asn Met Asn Tyr Lys Asp Tyr Leu Gin Met 
35 40 *^ 

Thr ASP Glu Asp Tyr Thr Asp Ser Tyr He Asn Pro Ser Leu Ser He 
50 55 60 

ser Gly Arg Asp Ala Val Gin Thr Ala Leu Thr Val Val Gly Arg He 
65 70 75 BO 



3060 
3120 



3240 
3300 
3360 
3420 
3471 
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Leu Gly Ala Leu Gly Val Pro Phe Ser Gly Gin He Val Ser Phe Tyr 
85 90 

Gln Phe Leu Leu Asn Thr Leu Trp Pro Val Asn Asp Thr Ala Ile Trp
100 105 110

Glu Ala Phe Met Arg Gln Val Glu Glu Leu Val Asn Gln Gln Ile Thr
115 120 125

Glu Phe Ala Arg Asn Gln Ala Leu Ala Arg Leu Gln Gly Leu Gly Asp
130 135 140

Ser Phe Asn Val Tyr Gln Arg Ser Leu Gln Asn Trp Leu Ala Asp Arg
145 150 155 160

Asn Asp Thr Arg Asn Leu Ser Val Val Arg Ala Gln Phe Ile Ala Leu
165 170 175

Asp Leu Asp Phe Val Asn Ala Ile Pro Leu Phe Ala Val Asn Gly Gln
180 185 190

Gln Val Pro Leu Leu Ser Val Tyr Ala Gln Ala Val Asn Leu His Leu
195 200 205

Leu Leu Leu Lys Asp Ala Ser Leu Phe Gly Glu Gly Trp Gly Phe Thr
210 215 220

Gln Gly Glu Ile Ser Thr Tyr Tyr Asp Arg Gln Leu Glu Leu Thr Ala
225 230 235 240

Lys Tyr Thr Asn Tyr Cys Glu Thr Trp Tyr Asn Thr Gly Leu Asp Arg
245 250 255

Leu Arg Gly Thr Asn Thr Glu Ser Trp Leu Arg Tyr His Gln Phe Arg
260 265 270

Arg Glu Met Thr Leu Val Val Leu Asp Val Val Ala Leu Phe Pro Tyr
275 280 285

Tyr Asp Val Arg Leu Tyr Pro Thr Gly Ser Asn Pro Gln Leu Thr Arg
290 295 300

Glu Val Tyr Thr Asp Pro Ile Val Phe Asn Pro Pro Ala Asn Val Gly
305 310 315 320

Leu Cys Arg Arg Trp Gly Thr Asn Pro Tyr Asn Thr Phe Ser Glu Leu
325 330 335

Glu Asn Ala Phe Ile Arg Pro Pro His Leu Phe Asp Arg Leu Asn Ser
340 345 350

Leu Thr Ile Ser Ser Asn Arg Phe Pro Val Ser Ser Asn Phe Met Asp
355 360 365



Tyr Trp Ser Gly His Thr Leu Arg Arg Ser Tyr Leu Asn Asp Ser Ala 
370 375 380 

Val Gln Glu Asp Ser Tyr Gly Leu Ile Thr Thr Thr Arg Ala Thr Ile
385 390 395 400

Asn Pro Gly Val Asp Gly Thr Asn Arg Ile Glu Ser Thr Ala Val Asp
405 410 415

Phe Arg Ser Ala Leu Ile Gly Ile Tyr Gly Val Asn Arg Ala Ser Phe
420 425 430

Val Pro Gly Gly Leu Phe Asn Gly Thr Thr Ser Pro Ala Asn Gly Gly
435 440 445

Cys Arg Asp Leu Tyr Asp Thr Asn Asp Glu Leu Pro Pro Asp Glu Ser
450 455 460

Thr Gly Ser Ser Thr His Arg Leu Ser His Val Thr Phe Phe Ser Phe
465 470 475 480

Gln Thr Asn Gln Ala Gly Ser Ile Ala Asn Ala Gly Ser Val Pro Thr
485 490 495

Tyr Val Trp Thr Arg Arg Asp Val Asp Leu Asn Asn Thr Ile Thr Pro
500 505 510

Asn Arg Ile Thr Gln Leu Pro Leu Val Lys Ala Ser Ala Pro Val Ser
515 520 525

Gly Thr Thr Val Leu Lys Gly Pro Gly Phe Thr Gly Gly Gly Ile Leu
530 535 540

Arg Arg Thr Thr Asn Gly Thr Phe Gly Thr Leu Arg Val Thr Val Asn
545 550 555 560

Ser Pro Leu Thr Gln Arg Tyr Arg Val Arg Val Arg Phe Ala Ser Ser
565 570 575

Gly Asn Phe Ser Ile Arg Ile Leu Arg Gly Asn Thr Ser Ile Ala Tyr
580 585 590

Gln Arg Phe Gly Ser Thr Met Asn Arg Gly Gln Glu Leu Thr Tyr Glu
595 600 605

Ser Phe Val Thr Ser Glu Phe Thr Thr Asn Gln Ser Asp Leu Pro Phe
610 615 620

Thr Phe Thr Gln Ala Gln Glu Asn Leu Thr Ile Leu Ala Glu Gly Val
625 630 635 640

Ser Thr Gly Ser Glu Tyr Phe Ile Asp Arg Ile Glu Ile Ile Pro Val
645 650 655

Asn Pro Ala Arg Glu Ala Glu Glu Asp Leu Glu Ala Ala Lys Lys Ala
660 665 670

Val Ala Asn Leu Phe Thr Arg Thr Arg Asp Gly Leu Gln Val Asn Val
675 680 685

Thr Asp Tyr Gln Val Asp Gln Ala Ala Asn Leu Val Ser Cys Leu Ser
690 695 700

Asp Glu Gln Tyr Gly His Asp Lys Lys Met Leu Leu Glu Ala Val Arg
705 710 715 720

Ala Ala Lys Arg Leu Ser Arg Glu Arg Asn Leu Leu Gln Asp Pro Asp
725 730 735

Phe Asn Thr Ile Asn Ser Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn
740 745 750

Gly Val Thr Ile Ser Glu Gly Gly Pro Phe Phe Lys Gly Arg Ala Leu
755 760 765

Gln Leu Ala Ser Ala Arg Glu Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys
770 775 780

Val Asp Ala Ser Val Leu Lys Pro Tyr Thr Arg Tyr Arg Leu Asp Gly
785 790 795 800

Phe Val Lys Ser Ser Gln Asp Leu Glu Ile Asp Leu Ile His His His
805 810 815

Lys Val His Leu Val Lys Asn Val Pro Asp Asn Leu Val Ser Asp Thr
820 825 830

Tyr Ser Asp Gly Ser Cys Ser Gly Ile Asn Arg Cys Asp Glu Gln His
835 840 845

Gln Val Asp Met Gln Leu Asp Ala Glu His His Pro Met Asp Cys Cys
850 855 860

Glu Ala Ala Gln Thr His Glu Phe Ser Ser Tyr Ile Asn Thr Gly Asp
865 870 875 880

Leu Asn Ala Ser Val Asp Gln Gly Ile Trp Val Val Leu Lys Val Arg
885 890 895

Thr Thr Asp Gly Tyr Ala Thr Leu Gly Asn Leu Glu Leu Val Glu Val
900 905 910

Gly Pro Leu Ser Gly Glu Ser Leu Glu Arg Glu Gln Arg Asp Asn Ala
915 920 925

Lys Trp Asn Ala Glu Leu Gly Arg Lys Arg Ala Glu Ile Asp Arg Val
930 935 940

Tyr Leu Ala Ala Lys Gln Ala Ile Asn His Leu Phe Val Asp Tyr Gln
945 950 955 960

Asp Gln Gln Leu Asn Pro Glu Ile Gly Leu Ala Glu Ile Asn Glu Ala
965 970 975

Ser Asn Leu Val Glu Ser Ile Ser Gly Val Tyr Ser Asp Thr Leu Leu
980 985 990

Gln Ile Pro Gly Ile Asn Tyr Glu Ile Tyr Thr Glu Leu Ser Asp Arg
995 1000 1005

Leu Gln Gln Ala Ser Tyr Leu Tyr Thr Ser Arg Asn Ala Val Gln Asn
1010 1015 1020

Gly Asp Phe Asn Ser Gly Leu Asp Ser Trp Asn Thr Thr Met Asp Ala
1025 1030 1035 1040

Ser Val Gln Gln Asp Gly Asn Met His Phe Leu Val Leu Ser His Trp
1045 1050 1055

Asp Ala Gln Val Ser Gln Gln Leu Arg Val Asn Pro Asn Cys Lys Tyr
1060 1065 1070

Val Leu Arg Val Thr Ala Arg Lys Val Gly Gly Gly Asp Gly Tyr Val
1075 1080 1085

Thr Ile Arg Asp Gly Ala His His Gln Glu Thr Leu Thr Phe Asn Ala
1090 1095 1100

Cys Asp Tyr Asp Val Asn Gly Thr Tyr Val Asn Asp Asn Ser Tyr Ile
1105 1110 1115 1120

Thr Glu Glu Val Val Phe Tyr Pro Glu Thr Lys His Met Trp Val Glu
1125 1130 1135

Val Ser Glu Ser Glu Gly Ser Phe Tyr Ile Asp Ser Ile Glu Phe Ile
1140 1145 1150

Glu Thr Gln Glu
1155



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3471 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

ATGAATCGAA ATAATCAAAA TGAATATGAA ATTATTGATG CCCCCCATTG TGGGTGTCCA 60

TCAGATGACG ATGTGAGGTA TCCTTTGGCA AGTGACCCAA ATGCAGCGTT ACAAAATATG 120

AACTATAAAG ATTACTTACA AATGACAGAT GAGGACTACA CTGATTCTTA TATAAATCCT 180 

AGTTTATCTA TTAGTGGTAG AGATGCAGTT CAGACTGCGC TTACTGTTGT TGGGAGAATA 240 

CTCGGGGCTT TAGGTGTTCC GTTTTCTGGA CAAATAGTGA GTTTTTATCA ATTCCTTTTA 300

AATACACTGT GGCCAGTTAA TGATACAGCT ATATGGGAAG CTTTCATGCG ACAGGTGGAG 360 

GAACTTGTCA ATCAACAAAT TACAGAATTT GCAAGAAATC AGGCACTTGC AAGATTGCAA 420

GGATTAGGAG ACTCTTTTAA TGTATATCAA CGTTCCCTTC AAA^TTGGTT GGCTGATCGA 480 

AATGATACAC GAAATTTAAG TGTTGTTCGT GCTCAATTTA TAGCTTTAGA CCTTGATTTT 540

GTTAATGCTA TTCCATTGTT TGCAGTAAAT GGACAGCAGG TTCCATTACT GTCAGTATAT 600

GCACAAGCTG TGAATTTACA TTTGTTATTA TTAAAAGATG CATCTCTTTT TGGAGAAGGA 660 

TGGGGATTCA CACAGGGGGA AATTTCCACA TATTATGACC GTCAATTGGA ACTAACCGCT 720 

AAGTACACTA ATTACTGTGA AACTTGGTAT AATACAGGTT TAGATCGTTT AAGAGGAACA 780 

AATACTGAAA GTTGGTTAAG ATATCATCAA TTCCGTAGAG AAATGACTTT AGTGGTATTA 840 

GATGTTGTGG CGCTATTTCC ATATTATGAT GTACGACTTT ATCCAACGGG ATCAAACCCA 900 

CAGCTTACAC GTGAGGTATA TACAGATCCG ATTGTATTTA ATCCACCAGC TAATGTTGGA 960 

CTTTGCCGAC GTTGGGGTAC TAATCCCTAT AATACTTTTT CTGAGCTCGA AAATGCCTTC 1020 

ATTCGCCCAC CACATCTTTT TGATAGGCTG AATAGCTTAA CAATCAGCAG TAATCGATTT 1080 

CCAGTTTCAT CTAATTTTAT GGATTATTGG TCAGGACATA CGTTACGCCG TAGTTATCTG 1140 

AACGATTCAG CAGTACAAGA AGATAGTTAT GGCCTAATTA CAACCACAAG AGCAACAATT 1200 

AATCCTGGAG TTGATGGAAC AAACCGCATA GAGTCAACGG CAGTAGATTT TCGTTCTGCA 1260 

TTGATAGGTA TATATGGCGT GAATAGAGCT TCTTTTGTCC CAGGAGGCTT GTTTAATGGT 1320 

ACGACTTCTC CTGCTAATGG AGGATGTAGA GATCTCTATG ATACAAATGA TGAATTACCA 1380 

CCAGATGAAA GTACCGGAAG TTCTACCCAT AGACTATCTC ATGTTACCTT TTTTAGTTTT 1440 

CAAACTAATC AGGCTGGATC TATAGCTAAT GCAGGAAGTG TACCTACTTA TGTTTGGACC 1500 

CGTCGTGATG TGGACCTTAA TAATACGATT ACCCCAAATA GAATTACACA ATTACCATTG 1560 

GTAAAGGCAT CTGCACCTGT TTCGGGTACT ACGGTCTTAA AAGGTCCAGG ATTTACAGGA 1620 

GGGGGTATAC TCCGAAGAAC AACTAATGGC ACATTTGGAA CGTTAAGAGT AACAGTTAAT 1680

TCACCATTAA CACAAAGATA TCGCGTAAGA GTTCGTTTTG CTTCATCAGG AAATTTCAGC 1740

ATAAGGATAC TGCGTGGAAA TACCTCTATA GCTTATCAAA GATTTGGGAG TACAATGAAC 1800 

AGAGGACAGG AACTAACTTA CGAATCATTT GTCACAAGTG AGTTCACTAC TAATCAGAGC 1860 

GATCTGCCTT TTACATTTAC ACAAGCTCAA GAAAATTTAA CAATCCTTGC AGAAGGTGTT 1920 

AGCACCGGTA GTGAATATTT TATAGATAGA ATTGAAATCA TCCCTGTGAA CCCGGCACGA 1980 

GAAGCAGAAG AGGATTTAGA AGCAGCGAAG AAAGCGGTGG CGAACTTGTT TACACGTACA 2040 

AGGGACGGAT TACAGGTAAA TGTGACAGAT TATCAAGTGG ACCAAGCGGC AAATTTAGTG 2100

TCATGCTTAT CCGATGAACA ATATGGGCAT GACAAAAAGA TGTTATTGGA AGCGGTAAGA 2160 

GCGGCAAAAC GCCTCAGCCG CGAACGCAAC TTACTTCAAG ATCCAGATTT TAATACAATC 2220 

AATAGTACAG AAGAGAATGG CTGGAAGGCA AGTAACGGTG TTACTATTAG CGAGGGCGGT 2280 

CCATTCTTTA AAGGTCGTGC ACTTCAGTTA GCAAGCGCAA GAGAAAATTA TCCAACATAC 2340 

ATTTATCAAA AAGTAGATGC ATCGGTGTTA AAGCCTTATA CACGCTATAG ACTAGATGGA 2400 

TTTGTGAAGA GTAGTCAAGA TTTAGAAATT GATCTCATCC ACCATCATAA AGTCCATCTT 2460 

GTAAAAAATG TACCAGATAA TTTAGTATCT GATACTTACT CAGATGGTTC TTGCAGCGGA 2520 

ATCAACCGTT GTGATGAACA GCATCAGGTA GATATGCAGC TAGATGCGGA GCATCATCCA 2580 

ATGGATTGCT GTGAAGCGGC TCAAACACAT GAGTTTTCTT CCTATATTAA TACAGGGGAT 2640 

CTAAATGCAA GTGTAGATCA GGGCATTTGG GTTGTATTAA AAGTTCGAAC AACAGATGGG 2700 

TATGCGACGT TAGGAAATCT TGAATTGGTA GAGGTTGGGC CATTATCGGG TGAATCTCTA 2760 

GAACGGGAAC AAAGAGATAA TGCGAAATGG AATGCAGAGC TAGGAAGAAA ACGTGCAGAA 2820 

ATAGATCGTG TGTATTTAGC TGCGAAACAA GCAATTAATC ATCTGTTTGT AGACTATCAA 2880 

GATCAACAAT TAAATCCAGA AATTGGGCTA GCAGAAATTA ATGAAGCTTC AAATCTTGTA 2940 

GAGTCAATTT CGGGTGTATA TAGTGATACA CTATTACAGA TTCCTGGGAT TAACTACGAA 3000

ATTTACACAG AGTTATCCGA TCGCTTACAA CAAGCATCGT ATCTGTATAC GTCTAGAAAT 3060 

GCGGTGCAAA ATGGAGACTT TAACAGTGGT CTAGATAGTT GGAATACAAC TATGGATGCA 3120 

TCGGTTCAGC AAGATGGCAA TATGCATTTC TTAGTTCTTT CGCATTGGGA TGCACAAGTT 3180

TCCCAACAAT TGAGAGTAAA TCCGAATTGT AAGTATGTCT TACGTGTGAC AGCAAGAAAA 3240 

GTAGGAGGCG GAGATGGATA CGTCACAATC CGAGATGGCG CTCATCACCA AGAAACTCTT 3300 

ACATTTAATG CATGTGACTA CGATGTAAAT GGTACGTATG TCAATGACAA TTCGTATATA 3360

ACAGAAGAAG TGGTATTCTA CCCAGAGACA AAACATATGT GGGTAGAGGT GAGTGAATCC 3420
GAAGGTTCAT TCTATATAGA CAGTATTGAG TTTATTGAAA CACAAGAGTA G 3471 
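
The opening residues of SEQ ID NO: 72 can be cross-checked against the first row of SEQ ID NO: 73, on the assumption (not stated in this listing) that SEQ ID NO: 73 is the coding sequence for SEQ ID NO: 72. The sketch below is illustrative only: it uses the standard genetic code, and the helper names `translate` and `expected_cds_length` are hypothetical and not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
assert len(AMINO) == 64
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string to one-letter amino acids.

    Whitespace (the 10-base grouping used in the listing) is ignored;
    any trailing partial codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def expected_cds_length(n_residues: int) -> int:
    """Bases needed to encode n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


# First 60-base row of SEQ ID NO: 73, copied from the listing.
first_row = "ATGAATCGAA ATAATCAAAA TGAATATGAA ATTATTGATG CCCCCCATTG TGGGTGTCCA"
print(translate(first_row))        # MNRNNQNEYEIIDAPHCGCP
print(expected_cds_length(1156))   # 3471
```

Under the same assumption, the stated lengths agree: 1156 residues plus one stop codon correspond to 3471 bases, and the translated first row matches residues 1 to 20 of SEQ ID NO: 72 (Met Asn Arg Asn Asn Gln Asn Glu Tyr Glu Ile Ile Asp Ala Pro His Cys Gly Cys Pro).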



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1150 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Met Asn Arg Asn Asn Pro Asn Glu Tyr Glu Ile Ile Asp Ala Pro Tyr
1 5 10 15

Cys Gly Cys Pro Ser Asp Asp Asp Val Arg Tyr Pro Leu Ala Ser Asp
20 25 30

Pro Asn Ala Ala Phe Gln Asn Met Asn Tyr Lys Glu Tyr Leu Gln Thr
35 40 45

Tyr Asp Gly Asp Tyr Thr Gly Ser Leu Ile Asn Pro Asn Leu Ser Ile
50 55 60

Asn Pro Arg Asp Val Leu Gln Thr Gly Ile Asn Ile Val Gly Arg Ile
65 70 75 80

Leu Gly Phe Leu Gly Val Pro Phe Ala Gly Gln Leu Val Thr Phe Tyr
85 90 95

Thr Phe Leu Leu Asn Gln Leu Trp Pro Thr Asn Asp Asn Ala Val Trp
100 105 110

Glu Ala Phe Met Ala Gln Ile Glu Glu Leu Ile Asp Gln Lys Ile Ser
115 120 125

Ala Gln Val Val Arg Asn Ala Leu Asp Asp Leu Thr Gly Leu His Asp
130 135 140

Tyr Tyr Glu Glu Tyr Leu Ala Ala Leu Glu Glu Trp Leu Glu Arg Pro
145 150 155 160

Asn Gly Ala Arg Ala Asn Leu Val Thr Gln Arg Phe Glu Asn Leu His
165 170 175

Thr Ala Phe Val Thr Arg Met Pro Ser Phe Gly Thr Gly Pro Gly Ser
180 185 190

Gln Arg Asp Ala Val Ala Leu Leu Thr Val Tyr Ala Gln Ala Ala Asn
195 200 205

Leu His Leu Leu Leu Leu Lys Asp Ala Glu Ile Tyr Gly Ala Arg Trp
210 215 220

Gly Leu Gln Gln Gly Gln Ile Asn Leu Tyr Phe Asn Ala Gln Gln Glu
225 230 235 240

Arg Thr Arg Ile Tyr Thr Asn His Cys Val Glu Thr Tyr Asn Arg Gly
245 250 255

Leu Glu Asp Val Arg Gly Thr Asn Thr Glu Ser Trp Leu Asn Tyr His
260 265 270

Arg Phe Arg Arg Glu Met Thr Leu Met Ala Met Asp Leu Val Ala Leu
275 280 285

Phe Pro Phe Tyr Asn Val Arg Gln Tyr Pro Asn Gly Ala Asn Pro Gln
290 295 300

Leu Thr Arg Glu Ile Tyr Thr Asp Pro Ile Val Tyr Asn Pro Pro Ala
305 310 315 320

Asn Gln Gly Ile Cys Arg Arg Trp Gly Asn Asn Pro Tyr Asn Thr Phe
325 330 335

Ser Glu Leu Glu Asn Ala Phe Ile Arg Pro Pro His Leu Phe Glu Arg
340 345 350

Leu Asn Arg Leu Thr Ile Ser Arg Asn Arg Tyr Thr Ala Pro Thr Thr
355 360 365

Asn Ser Phe Leu Asp Tyr Trp Ser Gly His Thr Leu Gln Ser Gln His
370 375 380

Ala Asn Asn Pro Thr Thr Tyr Glu Thr Ser Tyr Gly Gln Ile Thr Ser
385 390 395 400

Asn Thr Arg Leu Phe Asn Thr Thr Asn Gly Ala Arg Ala Ile Asp Ser
405 410 415

Arg Ala Arg Asn Phe Gly Asn Leu Tyr Ala Asn Leu Tyr Gly Val Ser
420 425 430

Ser Leu Asn Ile Phe Pro Thr Gly Val Met Ser Glu Ile Thr Asn Ala
435 440 445

Ala Asn Thr Cys Arg Gln Asp Leu Thr Thr Thr Glu Glu Leu Pro Leu
450 455 460

Glu Asn Asn Asn Phe Asn Leu Leu Ser His Val Thr Phe Leu Arg Phe
465 470 475 480

Asn Thr Thr Gln Gly Gly Pro Leu Ala Thr Leu Gly Phe Val Pro Thr
485 490 495

Tyr Val Trp Thr Arg Glu Asp Val Asp Phe Thr Asn Thr Ile Thr Ala
500 505 510

Asp Arg Ile Thr Gln Leu Pro Trp Val Lys Ala Ser Glu Ile Gly Gly
515 520 525

Gly Thr Thr Val Val Lys Gly Pro Gly Phe Thr Gly Gly Asp Ile Leu
530 535 540

Arg Arg Thr Asp Gly Gly Ala Val Gly Thr Ile Arg Ala Asn Val Asn
545 550 555 560

Ala Pro Leu Thr Gln Gln Tyr Arg Ile Arg Leu Arg Tyr Ala Ser Thr
565 570 575

Thr Ser Phe Val Val Asn Leu Phe Val Asn Asn Ser Ala Ala Gly Phe
580 585 590

Thr Leu Pro Ser Thr Met Ala Gln Asn Gly Ser Leu Thr Tyr Glu Ser
595 600 605

Phe Asn Thr Leu Glu Val Thr His Thr Ile Arg Phe Ser Gln Ser Asp
610 615 620

Thr Thr Leu Arg Leu Asn Ile Phe Pro Ser Ile Ser Gly Gln Glu Val
625 630 635 640

Tyr Val Asp Lys Leu Glu Ile Val Pro Ile Asn Pro Thr Arg Glu Ala
645 650 655

Glu Glu Asp Leu Glu Asp Ala Lys Lys Ala Val Ala Ser Leu Phe Thr
660 665 670

Arg Thr Arg Asp Gly Leu Gln Val Asn Val Thr Asp Tyr Gln Val Asp
675 680 685

Gln Ala Ala Asn Leu Val Ser Cys Leu Ser Asp Glu Gln Tyr Gly His
690 695 700

Asp Lys Lys Met Leu Leu Glu Ala Val Arg Ala Ala Lys Arg Leu Ser
705 710 715 720

Arg Glu Arg Asn Leu Leu Gln Asp Pro Asp Phe Asn Glu Ile Asn Ser
725 730 735

Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly Val Thr Ile Ser Glu
740 745 750

Gly Gly Pro Phe Phe Lys Gly Arg Ala Leu Gln Leu Ala Ser Ala Arg
755 760 765

Glu Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys Val Asp Ala Ser Thr Leu
770 775 780

Lys Pro Tyr Thr Arg Tyr Lys Leu Asp Gly Phe Val Gln Ser Ser Gln
785 790 795 800

Asp Leu Glu Ile Asp Leu Ile His His His Lys Val His Leu Val Lys
805 810 815

Asn Val Pro Asp Asn Leu Val Ser Asp Thr Tyr Ser Asp Gly Ser Cys
820 825 830

Ser Gly Ile Asn Arg Cys Glu Glu Gln His Gln Val Asp Val Gln Leu
835 840 845

Asp Ala Glu Asp His Pro Lys Asp Cys Cys Glu Ala Ala Gln Thr His
850 855 860

Glu Phe Ser Ser Tyr Ile His Thr Gly Asp Leu Asn Ala Ser Val Asp
865 870 875 880

Gln Gly Ile Trp Val Val Leu Gln Val Arg Thr Thr Asp Gly Tyr Ala
885 890 895

Thr Leu Gly Asn Leu Glu Leu Val Glu Val Gly Pro Leu Ser Gly Glu
900 905 910

Ser Leu Glu Arg Glu Gln Arg Asp Asn Ala Lys Trp Asn Glu Glu Val
915 920 925

Gly Arg Lys Arg Ala Glu Thr Asp Arg Ile Tyr Gln Asp Ala Lys Gln
930 935 940

Ala Ile Asn His Leu Phe Val Asp Tyr Gln Asp Gln Gln Leu Ser Pro
945 950 955 960

Glu Val Gly Met Ala Asp Ile Ile Asp Ala Gln Asn Leu Ile Ala Ser
965 970 975

Ile Ser Asp Val Tyr Ser Asp Ala Val Leu Gln Ile Pro Gly Ile Asn
980 985 990

Tyr Glu Met Tyr Thr Glu Leu Ser Asn Arg Leu Gln Gln Ala Ser Tyr
995 1000 1005

Leu Tyr Thr Ser Arg Asn Val Val Gln Asn Gly Asp Phe Asn Ser Gly
1010 1015 1020

Leu Asp Ser Trp Asn Ala Thr Thr Asp Thr Ala Val Gln Gln Asp Gly
1025 1030 1035 1040

Asn Met His Phe Leu Val Leu Ser His Trp Asp Ala Gln Val Ser Gln
1045 1050 1055

Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr Val Leu Arg Val Thr Ala
1060 1065 1070

Lys Lys Val Gly Asn Gly Asp Gly Tyr Val Thr Ile Gln Asp Gly Ala
1075 1080 1085

His His Arg Glu Thr Leu Thr Phe Asn Ala Cys Asp Tyr Asp Val Asn
1090 1095 1100

Gly Thr His Val Asn Asp Asn Ser Tyr Ile Thr Lys Glu Leu Val Phe
1105 1110 1115 1120

Tyr Pro Lys Thr Glu His Met Trp Val Glu Val Ser Glu Thr Glu Gly
1125 1130 1135

Thr Phe Tyr Ile Asp Ser Ile Glu Phe Ile Glu Thr Gln Glu
1140 1145 1150

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3453 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

ATGAATCGAA ATAATCCAAA TGAATATGAA ATTATTGATG CCCCCTATTG TGGGTGTCCG 60 

TCAGATGATG ATGTGAGGTA TCCTTTGGCA AGTGACCCAA ATGCAGCGTT CCAAAATATG 120 

AACTATAAAG AGTATTTACA AACGTATGAT GGAGACTACA CAGGTTCTCT TATCAATCCT 180 

AACTTATCTA TTAATCCTAG AGATGTACTA CAAACAGGTA TTAATATTGT GGGAAGAATA 240 

CTAGGGTTTT TAGGTGTTCC ATTTGCGGGT CAACTAGTTA CTTTCTATAC CTTTCTCTTA 300 

AATCAGTTGT GGCCAACTAA TGATAATGCA GTATGGGAAG CTTTTATGGC GCAAATAGAA 360

GAGCTAATCG ATCAAAAAAT ATCGGCGCAA GTAGTAAGGA ATGCACTCGA TGACTTAACT 420 

GGATTACACG ATTATTATGA GGAGTATTTA GCAGCATTAG AGGAGTGGCT GGAAAGACCG 480 

AACGGAGCAA GAGCTAACTT AGTTACACAG AGGTTTGAAA ACCTGCATAC TGCATTTGTA 540

ACTAGAATGC CAAGCTTTGG TACGGGTCCT GGTAGTCAAA GAGATGCGGT AGCGTTGTTG 600 

ACGGTATATG CACAAGCAGC GAATTTGCAT TTGTTATTAT TAAAAGATGC AGAAATCTAT 660 

GGGGCAAGAT GGGGACTTCA ACAAGGGCAA ATTAACTTAT ATTTTAATGC TCAACAAGAA 720 

CGTACTCGAA TTTATACCAA TCATTGCGTG GAAACATATA ATAGAGGATT AGAAGATGTA 780

AGAGGAACAA ATACAGAAAG TTGGTTAAAT TACCATCGAT TCCGTAGAGA GATGACATTA 840

ATGGCAATGG ATTTAGTGGC CCTATTCCCA TTCTATAATG TGCGACAATA TCCAAATGGG 900

GCAAATCCAC AGCTTACACG TGAAATATAT ACAGATCCAA TCGTATATAA TCCACCAGCT 960 

AATCAGGGAA TTTGCCGACG TTGGGGGAAT AATCCGTATA ATACATTTTC TGAACTTGAA 1020 

AATGCTTTTA TTCGCCCGCC ACATCTTTTT GAAAGGTTGA ACAGATTAAC TATTTCTAGA 1080 

AACCGATATA CAGCTCCAAC AACTAATAGC TTCCTAGACT ATTGGTCAGG TCATACTTTA 1140

CAAAGCCAAC ATGCAAATAA CCCGACGACA TATGAAACTA GTTACGGTCA GATTACCTCT 1200 

AA'CACACGTT TATTCAATAC GACTAATGGA GCCCGTGCAA TAGATTCSSGT GGCAAGAAAT 1260 

TTTGGTAACT TATACGCTAA TTTGTATGGC GTTAGCAGCT TGAACATTTT CCCAACAGGT 1320 

GTGATGAGTG AAATCACCAA TGCAGCTAAT ACGTGTCGGC AAGACCTTAC TACAACTGAA 1380

GAACTACCAC TAGAGAATAA TAATTTTAAT CTTTTATCTC ATGTTACTTT CTTACGCTTC 1440 

AATACTACTC AGGGTGGCCC CCTTGCAACT CTAGGGTTTG TACCCACATA TGTGTGGACA 1500 

CGTGAAGATG TAGATTTTAC GAACACAATT ACTGCGGATA GAATTACACA ACTACCATGG 1560 

GTAAAGGCAT CTGAAATAGG TGGGGGTACT ACTGTCGTGA AAGGTCCAGG ATTTACAGGA 1620 

GGGGATATAC TTCGAAGAAC GGACGGTGGT GCAGTTGGAA CGATTAGAGC TAATGTTAAT 1680

GCCCCATTAA CACAACAATA TCGTATAAGA TTACGCTATG CTTCGACAAC AAGTTTTGTT 1740 

GTTAATTTAT TTGTTAATAA TAGTGCGGCT GGCTTTACTT TACCGAGTAC AATGGCTCAA 1800 

AATGGTTCTT TAACATACGA GTCGTTTAAT ACCTTAGAGG TAACTCATAC TATTAGATTT 1860

TCACAGTCAG ATACTACACT TAGGTTGAAT ATATTCCCGT CTATCTCTGG TCAAGAAGTG 1920 

TATGTAGATA AACTTGAAAT CGTTGCAATT AACCCGACAC GAGAAGCGGA AGAAGATTTA 1980

GAAGATGCAA AGAAAGCGGT GGCGAGCTTG TTTACACGTA CAAGGGATGG ATTACAGGTA 2040 

AATGTGACAG ATTACCAAGT CGATCAGGCG GCAAATTTAG TGTCGTGCTT ATCAGATGAA 2100

CAATATGGGC ATGATAAAAA GATGTTATTG GAAGCCGTAC GCGCAGCAAA ACGCCTCAGC 2160 

CGCGAACGCA ACTTACTTCA AGATCCAGAT TTTAATGAAA TAAATAGCAC AGAAGAAAAT 2220 

GGCTGGAAGG CAAGTAACGG TGTTACTATT AGCGAGGGCG GTCCATTCTT TAAAGGTCGT 2280

GCACTTCAGT TAGCAAGCGC ACGTGAAAAT TACCCAACAT ACATCTATCA AAAGGTAGAT 2340 

GCATCGACGT TAAAACCTTA TACACGATAT AAACTAGATG GATTTGTGCA AAGTAGTCAA 2400 

GATTTAGAAA TTGACCTCAT TCATCATCAT AAAGTCCACC TCGTGAAAAA TGTACCAGAT 2460 

AATTTAGTAT CTGATACTTA TTCTGATGGC TCATGTAGTG GAATTAACCG TTGTGAGGAA 2520

CAACATCAGG TAGATGTGCA GCTAGATGCG GAGGATCATC CAAAGGATTG TTGTGAAGCG 2580

GCTCAAACAC ATGAGTTTTC TTCCTATATT CATACAGGTG ATCTAAATGC AAGTGTAGAT 2640 

CAAGGCATTT GGGTTGTATT GCAGGTTCGA ACAACAGATG GTTATGCGAC GTTAGGAAAT 2700 

CTTGAATTGG TAGAGGTTGG TCCATTATCG GGTGAATCTT TAGAACGAGA ACAAAGAGAT 2760 

AATGCGAT^T GGAATGAAGA GGTAGGAAGA AAGCGTGCAG AAACAGATCG CATATATCAA 2820 

GATGCGAAAC AAGCAATTAA CCATCTATTT GTAGACTATC AAGATCAACA ATTAAGTCCA 2880 

GAGGTAGGGA TGGCGGAI^AT TATTGATGCT CAAAATCTTA TCGCATCSAT TTCAGATGTA 2940 

TATAGCGATG CAGTACTGCA AATCCCTGGG ATTAACTACG AGATGTATAC AGAGTTATCC 3000 

AATCGATTAC AACAAGCATC GTATCTGTAT ACGTCTCGAA ATGTCGTGCA AAATGGGGAC 3060 

TTTAACAGTG GTTTAGATAG TTGGAATGCA ACAACTGATA CAGCTGTTCA GCAGGATGGC 3120

AATATGCATT TCTTAGTTCT TTCCCATTGG GATGCACAAG TTTCTCAACA ATTTAGAGTA 3180 

CAGCCGAATT GTAAATATGT GTTACGTGTG ACAGCGAAGA AAGTAGGGAA CGGAGATGGA 3240 

TATGTTACGA TCCAAGATGG CGCTCATCAC CGAGAAACAC TGACATTCAA TGCATGTGAC 3300

TACGATGTAA ATGGTACGCA TGTAAATGAT AATTCGTATA TTACAAAAGA ATTGGTGTTC 3360 

TATCCAAAGA CGGAACATAT GTGGGTAGAG GTAAGTGAAA CAGAAGGTAC CTTCTATATA 3420 

GACAGCATTG AGTTCATTGA AACACAAGAG TAG 3453 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1134 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu
1 5 10 15

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Gly Asn Val Arg
20 25 30

Thr Gly Leu Gln Thr Gly Ile Asp Ile Val Ala Val Val Val Gly Ala
35 40 45

Leu Gly Gly Pro Val Gly Gly Ile Leu Thr Gly Phe Leu Ser Thr Leu
50 55 60

Phe Gly Phe Leu Trp Pro Ser Asn Asp Gln Ala Val Trp Glu Ala Phe
65 70 75 80

Ile Glu Gln Met Glu Glu Leu Ile Glu Gln Arg Ile Ser Asp Gln Val
85 90 95

Val Arg Thr Ala Leu Asp Asp Leu Thr Gly Ile Gln Asn Tyr Tyr Asn
100 105 110

Gln Tyr Leu Ile Ala Leu Lys Glu Trp Glu Glu Arg Pro Asn Gly Val
115 120 125

Arg Ala Asn Leu Val Leu Gln Arg Phe Glu Ile Leu His Ala Leu Phe
130 135 140

Val Ser Ser Met Pro Ser Phe Gly Ser Gly Pro Gly Ser Gln Arg Phe
145 150 155 160

Gln Ala Gln Leu Leu Val Val Tyr Ala Gln Ala Ala Asn Leu His Leu
165 170 175

Leu Leu Leu Ala Asp Ala Glu Lys Tyr Gly Ala Arg Trp Gly Leu Arg
180 185 190

Glu Ser Gln Ile Gly Asn Leu Tyr Phe Asn Glu Leu Gln Thr Arg Thr
195 200 205

Arg Asp Tyr Thr Asn His Cys Val Asn Ala Tyr Asn Asn Gly Leu Ala
210 215 220

Gly Leu Arg Gly Thr Ser Ala Glu Ser Trp Leu Lys Tyr His Gln Phe
225 230 235 240

Arg Arg Glu Ala Thr Leu Met Ala Met Asp Leu Ile Ala Leu Phe Pro
245 250 255

Tyr Tyr Asn Thr Arg Arg Tyr Pro Ile Ala Val Asn Pro Gln Leu Thr
260 265 270

Arg Glu Val Tyr Thr Asp Pro Leu Gly Val Pro Ser Glu Glu Ser Ser
275 280 285

Leu Phe Pro Glu Leu Arg Cys Leu Arg Trp Gln Glu Thr Ser Ala Met
290 295 300

Thr Phe Ser Asn Leu Glu Asn Ala Ile Ile Ser Ser Pro His Leu Phe
305 310 315 320

Asp Thr Ile Asn Asn Leu Met Ile Tyr Thr Gly Ser Phe Ser Val His
325 330 335

Leu Thr Asn Gln Leu Ile Glu Gly Trp Ile Gly His Ser Val Thr Ser
340 345 350

Ser Leu Leu Ala Ser Gly Pro Thr Thr Val Leu Arg Arg Asn Tyr Gly
355 360 365

Ser Thr Thr Ser Ile Val Asn Tyr Phe Ser Phe Asn Asp Arg Asp Val
370 375 380

Tyr Gln Ile Asn Thr Arg Ser His Thr Gly Leu Gly Phe Gln Asn Ala
385 390 395 400

Pro Leu Phe Gly Ile Thr Arg Ala Gln Phe Tyr Pro Gly Gly Thr Tyr
405 410 415

Ser Val Thr Gln Arg Asn Ala Leu Thr Cys Glu Gln Asn Tyr Asn Ser
420 425 430

Ile Asp Glu Leu Pro Ser Leu Asp Pro Asn Glu Pro Ile Ser Arg Ser
435 440 445

Tyr Ser His Arg Leu Ser His Ile Thr Ser Tyr Leu His Arg Val Leu
450 455 460

Thr Ile Asp Gly Ile Asn Ile Tyr Ser Gly Asn Leu Pro Thr Tyr Val
465 470 475 480

Trp Thr His Arg Asp Val Asp Leu Thr Asn Thr Ile Thr Ala Asp Arg
485 490 495

Ile Thr Gln Leu Pro Leu Val Lys Ser Phe Glu Ile Pro Ala Gly Thr
500 505 510

Thr Val Val Arg Gly Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg
515 520 525

Thr Gly Val Gly Thr Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro
530 535 540

Leu Thr Gln Arg Tyr Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn
545 550 555 560

Leu Phe Ile Gly Ile Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp
565 570 575

Phe Gly Arg Thr Met Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe
580 585 590

Ala Thr Arg Glu Phe Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu
595 600 605

Leu Ile Ser Val Phe Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr
610 615 620

Phe Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys
625 630 635 640

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe Thr Arg
645 650 655

Thr Arg Asp Gly Leu Gln Val Asn Val Lys Asp Tyr Gln Val Asp Gln
660 665 670

Ala Ala Asn Leu Val Ser Cys Leu Ser Asp Glu Gln Tyr Gly Tyr Asp
675 680 685

Lys Lys Met Leu Leu Glu Ala Val Arg Ala Ala Lys Arg Leu Ser Arg
690 695 700

Glu Arg Asn Leu Leu Gln Asp Pro Asp Phe Asn Thr Ile Asn Ser Thr
705 710 715 720

Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly Val Thr Ile Ser Glu Gly
725 730 735

Gly Pro Phe Tyr Lys Gly Arg Ala Leu Gln Leu Ala Ser Ala Arg Glu
740 745 750

Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys Val Asp Ala Ser Glu Leu Lys
755 760 765

Pro Tyr Thr Arg Tyr Arg Ser Asp Gly Phe Val Lys Ser Ser Gln Asp
770 775 780

Leu Glu Ile Asp Leu Ile His His His Lys Val His Leu Val Lys Asn
785 790 795 800

Val Pro Asp Asn Leu Val Ser Asp Thr Tyr Pro Asp Asp Ser Cys Ser
805 810 815

Gly Ile Asn Arg Cys Gln Glu Gln Gln Met Val Asn Ala Gln Leu Glu
820 825 830

Thr Glu His His His Pro Met Asp Cys Cys Glu Ala Ala Gln Thr His
835 840 845

Glu Phe Ser Ser Tyr Ile Asp Thr Gly Asp Leu Asn Ser Ser Val Asp
850 855 860

Gln Gly Ile Trp Ala Ile Phe Lys Val Arg Thr Thr Asp Gly Tyr Ala
865 870 875 880

Thr Leu Gly Asn Leu Glu Leu Val Glu Val Gly Pro Leu Ser Gly Glu
885 890 895

Ser Leu Glu Arg Glu Gln Arg Asp Asn Thr Lys Trp Ser Ala Glu Leu
900 905 910

Gly Arg Lys Arg Ala Glu Thr Asp Arg Val Tyr Gln Asp Ala Lys Gln
915 920 925

Ser Ile Asn His Leu Phe Val Asp Tyr Gln Asp Gln Gln Leu Asn Pro
930 935 940

Glu Ile Gly Met Ala Asp Ile Met Asp Ala Gln Asn Leu Val Ala Ser
945 950 955 960

Ile Ser Asp Val Tyr Ser Asp Ala Val Leu Gln Ile Pro Gly Ile Asn
965 970 975

Tyr Glu Ile Tyr Thr Glu Leu Ser Asn Arg Leu Gln Gln Ala Ser Tyr
980 985 990

Leu Tyr Thr Ser Arg Asn Ala Val Gln Asn Gly Asp Phe Asn Asn Gly
995 1000 1005

Leu Asp Ser Trp Asn Ala Thr Ala Gly Ala Ser Val Gln Gln Asp Gly
1010 1015 1020

Asn Thr His Phe Leu Val Leu Ser His Trp Asp Ala Gln Val Ser Gln
1025 1030 1035 1040

Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr Val Leu Arg Val Thr Ala
1045 1050 1055

Glu Lys Val Gly Gly Gly Asp Gly Tyr Val Thr Ile Arg Asp Gly Ala
1060 1065 1070

His His Thr Glu Thr Leu Thr Phe Asn Ala Cys Asp Tyr Asp Ile Asn
1075 1080 1085

Gly Thr Tyr Val Thr Asp Asn Thr Tyr Leu Thr Lys Glu Val Ile Phe
1090 1095 1100

Tyr Ser His Thr Glu His Met Trp Val Glu Val Asn Glu Thr Glu Gly
1105 1110 1115 1120

Ala Phe His Ile Asp Ser Ile Glu Phe Val Glu Thr Glu Lys
1125 1130



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3411 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

ATGGATAACA ATCCGAACAT CAATGAATGC ATTCCTTATA ATTGTTTAAG TAACCCTGAA 60

GTAGAAGTAT TAGGTGGAGA AAGAGGAAAT GTTAGAACTG GACTACAAAC TGGAATTGAT 120 

ATTGTTGCAG TAGTAGTAGG TGCTTTAGGT GGACCAGTTG GTGGCATACT CACTGGTTTT 180 

CTTTCTACTC TTTTTGGTTT TCTTTGGCCA TCTAATGATC AAGCAGTATG GGAAGCTTTT 240 

ATAGAACAAA TGGAAGAACT GATTGAACAA AGGATATCAG ATCAAGTAGT AAGGACTGCA 300 

CTCGATGACT TAACTGGAAT TCAAAATTAT TATAATCAAT ATCTAATAGC ATTAAAGGAA 360 

TGGGAGGAAA GACCAAACGG CGTAAGAGCA AACTTAGTTT TGCAAAGATT TGAAATCTTG 420

CACGCGCTAT TTGTAAGTAG TATGCCAAGT TTTGGTAGTG GCCCTGGAAG TCAAAGGTTT 480 

CAGGCACAAT TGTTGGTTGT TTATGCGCAA GCAGCAAATC TTCATTTACT ATTATTAGCT 540 

GATGCTGAAA AGTATGGGGC AAGATGGGGA CTCCGTGAAT CCCAGATAGG AAATTTATAT 600 

TTTAATGAAC TACAAACTCG TACTCGAGAT TACACCAACC ATTGTGTAAA CGCGTATAAT 660 

AACGGGTTAG CCGGGTTACG AGGAACGAGC GCTGAAAGTT GGTTAAAGTA CCATCAATTC 720 

CGCAGAGAAG CAACCTTAAT GGCAATGGAT TTGATAGCTT TATTTCCATA TTATAACACC 780 

CGGCGATATC CAATCGCAGT AAATCCTCAG CTTACACGTG AGGTATATAC AGATCCATTA 840

GGCGTTCCTT CTGAAGAATC AAGTTTATTT CCAGAATTGA GATGCTTAAG ATGGCAAGAG 900 

ACTTCTGCCA TGACTTTTTC AAATTTGGAA AATGCAATAA TTTCGTCACC ACATCTATTT 960 

GACACAATAA ACAATTTAAT GATTTATACC GGTTCCTTTT CCGTTCACCT AACCAATCAA 1020 

TTAATTGAAG GGTGGATTGG ACATTCTGTA ACTAGTAGTT TGTTGGCCAG TGGACCAACA 1080 

ACAGTACTGA GAAGAAATTA CGGTAGCACG ACATCTATTG TAAACTATTT TAGTTTTAAT 1140 

GATCGTGATG TTTATCAGAT TAATACGAGA TCACATACTG GGTTGGGATT CCAGAACGCA 1200

CCTTTATTTG GAATCACTAG AGCTCAATTT TACCCAGGTG GGACTTATTC AGTAACTCAA 1260 

CGAAATGCAT TAACATGTGA ACAAAATTAT AATTCAATTG ATGAGTTACC GAGCCTAGAC 1320 

CCAAATGAAC CTATCAGTAG AAGTTATAGT CATAGATTAT CTCATATTAC CTCCTATTTG 1380 

CATCGTGTAT TGACTATTGA TGGTATTAAT ATATATTCAG GAAATCTCCC TACTTATGTA 1440 

TGGACCCATC GCGATGTGGA CCTTACAAAC ACGATTACCG CAGATAGAAT TACACAACTA 1500 

CCATTGGTAA AGTCATTTGA AATACCTGCG GGTACTACTG TCGTAAGAGG ACCAGGTTTT 1560

ACAGGAGGGG ATATACTCCG AAGAACAGGG GTTGGTACAT TTGGAACAAT AAGGGTAAGG 1620 

ACTACTGCCC CCTTAACACA AAGATATCGC ATAAGATTCC GTTTCGCTTC TACCACAAAT 1680

TTGTTCATTG GTATAAGAGT TGGTGATAGA CAAGTAAATT ATTTTGACTT CGGAAGAACA 1740

ATGAACAGAG GAGATGAATT AAGGTACGAA TCTTTTGCTA CAAGGGAGTT TACTACTGAT 1800

TTTAATTTTA GACAACCTCA AGAATTAATC TCAGTGTTTG CAAATGCATT TAGCGCTGGT 1860

CAAGAAGTTT ATTTTGATAG AATTGAGATT ATCCCCGTTA ATCCCGCACG AGAGGCGAAA 1920

GAGGATCTAG AAGCAGCAAA GAAAGCGGTG GCGAGCTTGT TTACACGCAC AAGGGACGGA 1980

TTACAAGTAA ATGTGAAAGA TTATCAAGTC GATCAAGCGG CAAATTTAGT GTCATGCTTA 2040

TCAGATGAAC AATATGGGTA TGACAAAAAG ATGTTATTGG AAGCGGTACG CGCGGCAAAA 2100

CGCCTCAGCC GAGAACGTAA CTTACTTCAG GATCCAGATT TTAATACAAT CAATAGTACA 2160

GAAGAAAATG GATGGAAAGC AAGTAACGGC GTTACTATTA GTGAGGGCGG TCCATTCTAT 2220

AAAGGCCGTG CACTTCAGCT AGCAAGTGCA CGAGAAAATT ATCCAACATA CATTTATCAA 2280

AAAGTAGATG CATCGGAGTT AAAACCTTAT ACACGTTATA GATCAGATGG GTTCGTGAAG 2340

AGTAGTCAAG ATTTAGAAAT TGATCTCATT CACCATCATA AAGTCCATCT TGTGAAAAAT 2400

GTACCAGATA ATTTAGTATC TGATACTTAC CCAGATGATT CTTGTAGTGG AATCAATCGA 2460

TGTCAGGAAC AACAGATGGT AAATGCGCAA CTGGAAACAG AGCATCATCA TCCGATGGAT 2520

TGCTGTGAAG CAGCTCAAAC ACATGAGTTT TCTTCCTATA TTGATACAGG GGATTTAAAT 2580

TCGAGTGTAG ACCAGGGAAT CTGGGCGATC TTTAAAGTTC GAACAACCGA TGGTTATGCG 2640

ACGTTAGGAA ATCTTGAATT GGTAGAGGTC GGACCGTTAT CGGGTGAATC TTTAGAACGT 2700

GAACAAAGGG ATAATACAAA ATGGAGTGCA GAGCTAGGAA GAAAGCGTGC AGAAACAGAT 2760

CGCGTGTATC AAGATGCCAA ACAATCCATC AATCATTTAT TTGTGGATTA TCAAGATCAA 2820

CAATTAAATC CAGAAATAGG GATGGCAGAT ATTATGGACG CTCAAAATCT TGTCGCATCA 2880

ATTTCAGATG TATATAGCGA TGCCGTACTG CAAATCCCTG GAATTAACTA TGAGATTTAC 2940

ACAGAGCTGT CCAATCGCTT ACAACAAGCA TCGTATCTGT ATACGTCTCG AAATGCGGTG 3000

CAAAATGGGG ACTTTAACAA CGGGCTAGAT AGCTGGAATG CAACAGCGGG TGCATCGGTA 3060

CAACAGGATG GCAATACGCA TTTCTTAGTT CTTTCTCATT GGGATGCACA AGTTTCTCAA 3120

CAATTTAGAG TGCAGCCGAA TTGTAAATAT GTATTACGTG TAACAGCAGA GAAAGTAGGC 3180

GGCGGAGACG GATACGTGAC TATCCGGGAT GGTGCTCATC ATACAGAAAC GCTTACATTT 3240

AATGCATGTG ATTATGATAT AAATGGCACG TACGTGACTG ATAATACGTA TCTAACAAAA 3300

GAAGTGATAT TCTATTCACA TACAGAACAC ATGTGGGTAG AGGTAAATGA AACAGAAGGT 3360

3411



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15 

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 
20 25 30 

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 
35 40 45 

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 
85 90 95 

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 
100 105 110 

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 
130 135 140 

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 
145 150 155 160 

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 
245 250 255 

Thr Lys Glu Asn Val Lys Ala Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 
275 280 285 

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 
290 295 300 

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 
340 345 350 

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 
405 410 415 

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 
485 490 495 

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 
515 520 525 

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 
645 650 655 

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys 
660 665 670 

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 
675 680 685 

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 
690 695 700 

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750 

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 
755 760 765 

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 
770 775 780 

Asp Val Ser Ile Lys 
785 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2370 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180 

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360 

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540 

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780 

GTGAAAGCAA GTGGCAGTGA GGTOGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840 

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080 

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200 

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATTGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620 

AACGGGTCCA TAGAAGAGGA CAATTTAGAA CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740 

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860 

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280 

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340 

GTACATTTTT ACGATGTCTC TATTAAGTAA 2370 



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Met Asn Lys Asp Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15 

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 
20 25 30 

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 
35 40 45 

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 
85 90 95 

Asn Gln Val Leu Asn Glu Val Asn Asn Lys Leu Glu Ala Ile Ser Thr 
100 105 110 

Ile Phe Arg Val Tyr Leu Pro Lys Asn Thr Ser Arg Gly Gly Gly Val 
115 120 125 

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Met Glu Asn Leu Ser Lys 
130 135 140 

Gln Leu Gln Glu Ile Ser Val Lys Trp Asp Ile Ile Asn Val Asn Val 
145 150 155 160 

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 
245 250 255 

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 
275 280 285 



Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 
290 295 300 

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 
340 345 350 

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 
405 410 415 

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 
485 490 495 

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 
515 520 525 

Val Pro Pro Ser Gly Phe Ile Ser Xaa Ile Val Glu Asn Gly Ser Ile 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 



Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 
645 650 655 

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys 
660 665 670 

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 
675 680 685 

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 
690 695 700 

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750 

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 
755 760 765 

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 
770 775 780 

Asp Val Ser Ile Lys 
785 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
ATGAACAAGG ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180 

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 

AATGAGGTTA ATAACAAACT CGAGGCGATA AGTACGATTT TTCGGGTATA TTTACCTAAA 360 

AATACCTCTA GGGGGGGGGG GGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATGGAA 420 

AACTTGAGTA AACAATTACA AGAGATTTCT GTTAAGTGGG ATATTATTAA TGTAAATGTA 480 
CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540 
GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 
TCTCCCGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 
AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780 
GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840 
CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080 

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200 

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACCGACTTA 1560 

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCSA TATTGTAGAG 1620 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740 

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860 

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280 

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340 

GTTCATTTTT ACGATGTCTC TATTAAGTAA CCCAA 2375 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15 

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 
20 25 30 

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 
35 40 45 

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 
85 90 95 



Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 
100 105 110 

Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 
130 135 140 

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 
145 150 155 160 

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp Ile Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 
245 250 255 

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 
275 280 285 

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 
290 295 300 

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 
340 345 350 

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 
370 375 380 



Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 
405 410 415 

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 
485 490 495 

Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 
515 520 525 

Val Leu Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 

Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 
645 650 655 

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys 
660 665 670 



Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 
675 680 685 

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 
690 695 700 

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750 

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 
755 760 765 

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Asn 
770 775 780 

Asp Val Ser Ile Lys 
785 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120 

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 300 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360 

ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 480 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 540 

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATAGC 600 

TCGCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 660 

AAAAATGACG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780 

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840 

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGGTTT 1080 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTTG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATACGGA TAAATTATTG 1200 

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATCGTCCTG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740 

TTTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860 

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 

ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280 

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340 

GTACATTTTA ACGATGTCTC TATTAAGTAA CCCAA 2375 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 



Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 
1 5 10 15 

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 
20 25 30 

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 
35 40 45 

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 
85 90 95 

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 
100 105 110 

Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 
130 135 140 

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 
145 150 155 160 

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Ser Ser Lys Val Lys Lys Asp Ser Pro Pro Ala Asp Ile Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 
245 250 255 

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 
275 280 285 

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr 
290 295 300 

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys 
340 345 350 

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe 
405 410 415 

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala 
485 490 495 

Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile 
515 520 525 

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 

Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu 
645 650 655 

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys 
660 665 670 

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly 
675 680 685 

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg 
690 695 700 

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Gly Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750 

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu 
755 760 765 

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr 
770 775 780 

Asp Val Ser Ile Lys 
785 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCCTAC CGAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120 

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 300 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360 

ATTACATCTA TGTTAAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTC 480 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 540 

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATAGC 600 

CCCCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGACG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780 

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 840 

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200 

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740 

TTTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860 

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 

ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 

TTTGAAAAAG GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280 

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340 

GTACATTTTT ACGATGTCTC TATTAAGTAA CCAAG 2375 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 759 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 
1 5 10 15 



WO 99/33991 PCT/US98/26585 



Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp 
20 25 30 

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 
35 40 45 

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln 
85 90 95 

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr 
100 105 110 

Met Leu Arg Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Asn Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys 
130 135 140 

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 
145 150 155 160 

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Xaa Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Ile Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Ile Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile 
245 250 255 

Xaa Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr 
275 280 285 



Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp He Asp Tyr Thr 
290 295 300 
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Ser lie Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser His
530 535 540

Arg Arg Gly Gln Phe Arg Ala Val Glu Ser Lys Glu Cys Val Cys Arg
545 550 555 560

Ser Tyr Arg Arg Ser Glu Trp Asn Ser Phe Ile Cys Ser Gly Arg Arg
565 570 575

Asn Phe Thr Ile Tyr Trp Arg Val Lys Thr Glu Asn Val Cys Asn Pro
580 585 590

Ile Tyr Cys Arg Lys Thr Phe Tyr Ser Phe Lys Arg Lys Tyr Trp Ile
595 600 605

Tyr Ser Leu Arg Tyr Lys Phe Lys Arg Leu Ser Asn Tyr Tyr Thr Phe 
610 615 620 

Tyr Tyr Arg Asn Phe Lys Gly Ser Val Phe Asn Phe Lys Lys Ser Lys 
625 630 635 640 

Trp Arg Ser Leu Gly Arg Leu Tyr Tyr Phe Gly Asn Ser Phe Lys Val 
645 650 655 

Ile Lys Ser Arg Ile Asn Tyr Lys Leu Asp Glu Tyr Gly Ile Asn Ser
660 665 670 

Tyr Arg Tyr Thr His Ser Leu Ser Gly Arg Thr Arg Asn Ser Lys Thr 
675 680 685 

Lys Pro Ser Ile Arg Phe Phe Asn Leu Ser Val Phe Phe Cys Val Arg
690 695 700

Arg Cys Cys Lys Asp Lys Phe Gly Ser Val Ile Lys Lys Ile Tyr Glu
705 710 715 720

Arg Cys Arg Cys Phe Asn Val His Tyr Lys Ile Glu Arg Leu Leu Tyr
725 730 735

Arg Ala Phe Ser Arg Glu Phe Ile Trp Trp Ser Tyr Cys Thr Phe Leu
740 745 750

Arg Cys Leu Tyr Val Thr Gln
755 



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2376 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

ATGAACAAGA ATAATACTAA ATTAAGCGCA AGAGCCCTAC CGAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120 

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAAAA TCAAGTCTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGATATA TCTACCTAAA 360

ATTACATCTA TGTTAAGTGA TGTAATGAAC CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTGG ATATTATTAA TGTAAATGTA 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAKTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660

AAAAATGATG TGGATGGTTT TGAAATTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAA TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAS TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA GGTAGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140

TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC GTTAGGTGTC 1440

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620

AACGGGTCCC ATAGAAGAGG ACAATTTAGA GCCGTGGAAA GCAAATAATA AGAATGCGTA 1680

TGTAGATCAT ACAGGCGGAG TGAATGGAAC TAAAGCTTTA TATGTTCATA AGGACGGAGG 1740

AATTTCACAA TTTATTGGAG ATAAGTTAAA ACCGAAAACT GAGTATGTAA TCCAATATAC 1800

TGTTAAAGGA AAACCTTCTA TTCATTTAAA AGATGAAAAT ACTGGATATA TTCATTATGA 1860

AGATACAAAT AATAATTTAA AAGATTATCA AACTATTACT AAACGTTTTA CTACAGGAAC 1920

TGATTTAAAG GGAGTGTATT TAATTTTAAA AAGTCAAAAT GGAGATGAAG CTTGGGGAGA 1980

TAACTTTATT ATTTTGGAAA TTAGTCCTTC TGAAAAGTTA TTAAGTCCAG AATTAATTAA 2040

TACAAATAAT TGGACGAGTA CGGGATCAAC TCATATTAGC GGTAATACAC TCACTCTTTA 2100

TCAGGGAGGA CGAGGAATTC TAAAACAAAA CCTTCAATTA GATAGTTTTT CAACTTATAG 2160 

AGTGTATTTT TCTGTGTCCG GAGATGCTAA TGTAAGGATT AGAAATTCTA GGGAAGTGTT 2220 

ATTTGAAAAA AGATATATGA GCGGTGCTAA AGATGTTTCT GAAATGTTCA CTACAAAATT 2280 

TGAGAAAGAT AACTTTTATA TAGAGCTTTC TCAAGGGAAT AATTTATATG GTGGTCCTAT 2340 

TGTACATTTT TACGATGTCT CTATTAAGTA ACCCAA 2376



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 511 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Tyr Leu Ser Lys Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile
1 5 10 15

Asn Val Asn Val Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala
20 25 30

Tyr Gln Arg Ile Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe
35 40 45

Ala Thr Glu Thr Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp
50 55 60

Ile Leu Asp Glu Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr
65 70 75 80

Lys Asn Asp Val Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp
85 90 95

Val Met Val Gly Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala
100 105 110

Ser Glu Leu Ile Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val
115 120 125

Gly Asn Val Tyr Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys
130 135 140






Ala Phe Leu Thr Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp
145 150 155 160

Ile Asp Tyr Thr Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu
165 170 175

Glu Phe Arg Val Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn
180 185 190

Pro Asn Tyr Ala Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile
195 200 205

Val Glu Ala Lys Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn
210 215 220

Asp Ser Ile Thr Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn
225 230 235 240

Tyr Gln Val Asp Lys Asp Pro Leu Ser Glu Val Ile Tyr Gly Asp Thr
245 250 255

Asp Lys Leu Leu Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn
260 265 270

Asn Ile Val Phe Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr
275 280 285

Lys Lys Met Lys Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp
290 295 300

Ser Ser Thr Gly Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser
305 310 315 320

Glu Ala Glu Tyr Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met
325 330 335

Pro Leu Gly Val Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe
340 345 350

Gly Leu Gln Ala Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys
355 360 365

Ser Tyr Leu Arg Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu
370 375 380

Thr Lys Leu Ile Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu
385 390 395 400

Asn Gly Ser Ile Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn
405 410 415

Lys Asn Ala Tyr Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala
420 425 430

Leu Tyr Val His Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys
435 440 445

Leu Lys Pro Lys Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys
450 455 460

Pro Ser Ile His Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu
465 470 475 480

Asp Thr Asn Asn Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe
485 490 495

Thr Thr Gly Thr Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser
500 505 510

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1533 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 60

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 120 

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATAGC 180 

TCGCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 240

AAAAATGACG TTGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 300 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 360 

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 420 

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 480 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 540 

AACATCCTYC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 600 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 660 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 720

TATCAAGTTG ATAAGGATCC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 780 

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 840

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 900

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 960 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1020

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1080 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1140

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1200 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1260

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1320

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1380 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1440 

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1500 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGT 1533 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15 

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asp Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Xaa Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Val Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Pro Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Lys Leu Leu Leu Ala Thr Asp Phe Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Leu Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Xaa Asn Gly Ser Ile
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Gly Lys Ala Asn Asn Arg Asn Ala Tyr 
545 550 555 560 

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Glu Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Ile Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Asn Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2367 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180 

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA AGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 

AATGATGTTG ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360

ATTACCCTAT GTTGAGTGAT GTAATGAAAC AAAATTATGC GCTAAGTCTG CAAATAGAAT 420

ACTTAAGTAA ACAATTGCAA GAGATTTCTG ATAAGTTGGA TATTATTAAT GTAAATGTAC 480 

TTATTAACTC TACACTTACT GAAATTACAC CTGCGTATCA AAGGATTAAA TATGTGAACG 540 
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AAAAATTTGA GGAATTAACT TTTGCTACAG AAACTAGTTC AAAAGTAAAA AAGGATGGCT 600 

CTCCTGCAGA TATTCTTGAT GAGTTAACTG AGTTAACTGA ACTAGCGAAA AGTGTAACAA 660 

AAAATGATGT GGATGGTTTT GAATTTTACC TTAATACATT CCACGATGTA ATGGTAGGAA 720 

ATAATTTATT CGGGCGTTCA GCTTTAAAAA CTGCATCGGA ATTAATTACT AAAGAAAATG 780 
TGAAAACAAG TGGCAGTGAG GTCGGAAATG TTTATAACTT CTTAATTGTA TTAACAGCTC 840 
TGCAAGCAAA AGCTTTTCTT ACTTTAACAA CATGCCGAAA ATTATTAGGC TTAGCAGATA 900 
TTGATTATAC TTCTATTATG AATGAACATT TAAATAAGGA AARAGAGGAA TTTAGAGTAA 960 

ACATCCTCCC TACACTTTCT AATACTTTTT CTAATCCTAA TTATGCAAAA GTTAAAGGAA 1020 

GTGATGAAGA TGCAAAGATG ATTGTGGAAG CTAAACCAGG ACATGCATTG GTTGGGTTTG 1080

AAATTAGTAA TGATTCAATT ACAGTATTAA AAGTATATGA GGCTAAGCTA AAACAAAATT 1140

ATCAAGTTGA TAAGGATTCC TTATCGGAAG TTATTTATGG TGATATGGAT AAATTATTGT 1200 

GCCCAGATCA ATCTGAACAA ATCTATTATA CAAATAACAT AGTATTTCCA AATGAATATG 1260 

TAATTACTAA AATTGATTTT ACTAAAAAAA TGAAAACTTT AAGATATGAG GTAACAGCGA 1320 

ATTTTTATGA TTCTTCTACA GGAGAAATTG ACTTAAATAA GAAAAAAGTA GAATCAAGTG 1380 

AAGCGGAGTA TAGAACGTTA AGTGCTAATG ATGATGGAGT GTATATGCCG TTAGGTGTCA 1440 

TCAGTGAAAC ATTTTTGACT CCGATTAATG GGTTTGGCCC CCAAGCTGAT GAAAATTCAA 1500 

GATTAATTAC TTTAACATGT AAATCATATT TAAGAAAACT ACTGCTAGCA ACAGACTTTA 1560 

GCAATAAAGA AACTAAATTG ATCCTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAAA 1620

CGGGTCCATA GAAGAGGACA ATTTAGAGCC GGGGAAAGCA AATAATAGGA ATGCGTATGT 1680

AGATCATACA GGCGGAGTGA ATGGAACTAA AGCTTTATAT GTTCATAAGG ACGGAGGAAT 1740 

TTCACAATTT ATTGGAGATA AGTTAAAACC GAAAACTGAG TATGTAATCC AATATACTGT 1800 

TAAAGGAAAA CCTTCTATTC ATTTAAAAGA TGAAAATACT GGATATATTC ATTATGAAGA 1860

TACAAATAAT AATTTAGAAG ATTATCAAAC TATTACTAAA CGTTTTACTA CAGGAACTGA 1920 

TTTAAAGGGA GTGTATTTAA TTTTAAAAAG TCAAAATGGA GATGAAGCTT GGGGAGATAA 1980 

CTTTATTATT TTGGAAATTA GTCCTTCTGA AAAGTTATTA AGTCCAGAAT TAATTAATAC 2040 

AAATAATTGG ACGAGTACGG GATCAACTAA TATTAGCGGT AATACACTCA CTCTTTATCA 2100 

GGGAGGACGA GGAATTCTAA AACAAAACCT TCAATTAGAT AGTTTTTCAA CTTATAGAGT 2160 

GTATTTTTCT GTGTCCGGAG ATGCTAATGT AAGGATTAGA AATTCTAGGG AAGTGTTATT 2220

TGAAAAAAGA TATATGAGCG GTGCTAAAGA TGTTTCTGAA ATTTTCACTA CAAAATTTGA 2280

GAAAGATAAC TTTTATATAG AGCTTTCTCA AGGGAATAAT TTAAATGGTG GCCCTATTGT 2340

ACATTTTTAC GATGTCTCTA TTAAGTA 2367

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys
50 55 60

Leu Gly Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460 



Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2369 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCCTAC CGAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGGGGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAAAT CAAGTCTTAA 300

ATGATGTTAA TAACAAACTC GATGCGATAA ATACGATGCT TCATATATAT CTACCTAAAA 360 

TTACATCTAT GTTAAGTGAT GTAATGAAGC AAAATTATGC GCTAAGTCTG CAAATAGAAT 420 

ACTTAAGTAA ACAATTGCAA GAAATTTCTG ATAAATTAGA TATTATTAAC GTAAATGTTC 480 

TTATTAACTC TACACTTACT GAAATTACAC CTGCATATCA ACGGATTAAA TATGTGAATG 540 

AAAAATTTGA AGAATTAACT TTTGCTACAG AAACCACTTT AAAAGTAAAA AAGGATAGCT 600

CGCCTGCTGA TATTCTTGAT GAGTTAACTG AATTAACTGA ACTAGCGAAA AGTGTTACAA 660 

AAAATGACGT TGATGGTTTT GAATTTTACC TTAATACATT CCACGATGTA ATGGTAGGAA 720 

ATAATTTATT CGGGCGTTCA GCTTTAAAAA CTGCTTCAGA ATTAATTGCT AAAGAAAATG 780 

TGAAAACAAG TGGCAGTGAA GTAGGAAATG TTTATAATTT CTTAATTGTA TTAACAGCTC 840 

TACAAGCAAA AGCTTTTCTT ACTTTAACAA CATGCCGAAA ATTATTAGGC TTAGCAGATA 900 

TTGATTATAC TTCTATTATG AATGAACATT TAAATAAGGA AAAAGAGGAA TTTAGAGTAA 960 

ACATCCTTCC TACACTTTCT AATACTTTTT CTAATCCTAA TTATGCAAAA GTTAAAGGAA 1020 

GTGATGAAGA TGCAAAGATG ATTGTGGAAG CTAAACCAGG ATATGCATTG GTTGGTTTTG 1080

AAATGAGCAA TGATTCAATC ACAGTATTAA AAGTATATGA GGCTAAGCTA AAACAAAATT 1140

ATCAAGTTGA TAAGGATTCC TTATCGGAGG TTATTTATGG TGATACGGAT AAATTATTGT 1200 

GTCCAGATCA ATCTGAACAA ATATATTATA CAAATAACAT AGTATTTCCA AATGAATATG 1260 

TAATTACTAA AATTGATTTC ACTAAAAAAA TGAAAACTTT AAGATATGAG GTAACAGCGA 1320 

ATTTTTATGA TTCTTCTACA GGAGAAATTG ACTTAAATAA GAAAAAAGTA GAATCAAGTG 1380 

AAGCGGAGTA TAGAACGTTA AGTGCTAATG ATGATGGAGT GTATATGCCA TTAGGTGTCA 1440 

TCAGTGAAAC ATTTTTGACT CCGATAAATG GGTTTGGCCT CCAAGCTGAT GGAAATTCAA 1500

GATTAATTAC TTTAACATGT AAATCATATT TAAGAGAACT ACTGCTAGCA ACAGACTTAA 1560 

GCAATAAAGA AACTAAATTG ATTGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620 

ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CGTGGAAAGC AAATAATAAG AATGCGTATG 1680 

TAGATCATAC AGGCGGAGTG AATGGAACTA AAGCTTTATA TGTTCATAAG GACGGAGGAA 1740 

TTTCACAATT TATTGGAGAT AAGTTAAAAC CGAAAACTGA GTATGTAATC CAATATACTG 1800 

TTAAAGGAAA ACCTTCTATT CATTTAAAAG ATGAAAATAC TGGATATATT CATTATGAAG 1860 

ATACAAATAA TAATTTAAAA GATTATCAAA CTATTACTAA ACGTTTTACT ACAGGAACTG 1920 

ATTTAAAGGG AGTGTATTTA ATTTTAAAAA GTCAAAATGG AGATGAAGCT TGGGGAGATA 1980 

ACTTTATTAT TTTGGAAATT AGTCCTTCTG AAAAGTTATT AAGTCCAGAA TTAATTAATA 2040 

CAAATAATTG GACGAGTACG GGATCAACTC ATATTAGCGG TAATACACTC ACTCTTTATC 2100 

AGGGAGGACG AGGAATTCTA AAACAAAACC TTCAATTAGA TAGTTTTTCA ACTTATAGAG 2160 

TGTATTTTTC TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT 2220 

TTGAAAAAAG ATATATGAGC GGTGCTAAAG ATGTTTCTGA AATGTTCACT ACAAAATTTG 2280 

AGAAAGATAA CTTTTATATA GAGCTTTCTC AAGGGAATAA TTTATATGGT GGTCCTATTG 2340 

TACATTTTTA CGATGTCTCT ATTAAGTAA 2369 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205 

Leu Ala Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255 

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 



Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 



Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785 



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2370 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

TTGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360 

ATTAGGTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540 

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAG ATATTCTTGA TGAGTTAGCT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780 

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840 

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200 

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATTGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100

CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340

GTACATTTTT ACGATGTCTC TATTAAGTAA 2370



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175



Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Asn Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

He Ser Glu Thr Phe Leu Thr Pro He Asn Gly Phe Gly Leu Gin Ala 
485 490 495 

Asp Glu Asn Ser Arg Leu He Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu He 
515 520 525 

Val Pro Pro Ser Gly Phe He Ser Asn He Val Glu Asn Gly Ser He 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 



Lys Asp Gly Gly He Ser Gin Phe He Gly Asp Lys Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Val He Gin Tyr Thr Val Lys Gly Lys Pro Ser He His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr He His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Glu Asp Tyr Gin Thr He Asn Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu He Leu Lys Ser Gin Asn Gly Asp Glu 
645 650 655 



Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2374 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360 

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780 

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAAGAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080 

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200 

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAACGT CGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620 

ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CGTGGAAAGC AAATAATAAG AATGCGTATG 1680 

TAGATCATAC AGGCGGAGTG AATGGAACTA AAGCTTTATA TGTTCATAAG GACGGAGGAA 1740 

TTTCACAATT TATTGGAGAT AAGTTAAAAC CGAAAACTGA GTATGTAATC CAATATACTG 1800

TTAAAGGAAA ACCTTCTATT CATTTAAAAG ATGAAAATAC TGGATATATT CATTATGAAG 1860 

ATACAAATAA TAATTTAGAA GATTATCAAA CTATTAATAA ACGTTTTACT ACAGGAACTG 1920

ATTTAAAGGG AGTGTATTTA ATTTTAAAAA GTCAAAATGG AGATGAAGCT TGGGGAGATA 1980 

ACTTTATTAT TTTGGAAATT AGTCCTTCTG AAAAGTTATT AAGTCCAGAA TTAATTAATA 2040 

CAAATAATTG GACGAGTACG GGATCAACTA ATATTAGCGG TAATACACTC ACTCTTTATC 2100

AGGGAGGACG AGGGATTCTA AAACAAAACC TTCAATTAGA TAGTTTTTCA ACTTATAGAG 2160 

TGTATTTTTC TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT 2220

TTGAAAAAAG ATATATGAGC GGTGCTAAAG ATGTTTCTGA AATGTTCACT ACAAAATTTG 2280 

AGAAAGATAA CTTTTATATA GAGCTTTCTC AAGGGAATAA TTTATATGGT GGTCCTATTG 2340 

TACATTTTTA CGATGTCTCT ATTAAGTAAC CCAA 2374 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15

He Asp Tyr Phe Asn Gly He Tyr Gly Phe Ala Thr Gly He Lys Asp 
20 25 30 

He Met Asn Met He Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 
35 40 45 

Asp Glu He Leu Lys Asn Gin Gin Leu Leu Asn Glu He Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu He Ala Gin Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu He Leu Lys He Ala Asn Glu Gin 
85 90 95 

Asn Gin Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala He Asn Thr 
100 105 110 

Met Leu His He Tyr Leu Pro Lys He Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Lys Gin Asn Tyr Ala Leu Ser Leu Gin He Glu Tyr Leu Ser Lys 
130 135 140 

Gin Leu Xaa Glu He Ser Asp Lys Leu Asp He He Asn Val Asn Val 
145 150 155 160 

Leu He Asn Ser Thr Leu Thr Glu He Thr Pro Ala Tyr Gin Arg He 
165 170 175 



Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 



WO 99/33991

PCT/US98/26585



Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp lie Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Xaa Lys Leu Leu Gly Leu Ala Asn Ile Asp Tyr Thr
290 295 300 

Ser lie Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn lie Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met lie Val Glu Ala Lys 
340 345 350 

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser lie Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gin Asn Tyr Gin Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val lie Tyr Gly Asp Thr Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gin Ser Glu Gin lie Tyr Tyr Thr Asn Asn lie Val Phe 
405 410 415 

Pro Asn Glu Tyr Val He Thr Lys He Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu He Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 



Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Xaa Ile Xaa Gly Phe Gly Leu Gln Ala
485 490 495 

Asp Gly Asn Ser Arg Leu He Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu He 
515 520 525 

Val Pro Pro Ser Gly Phe He Ser Asn He Val Glu Asn Gly Ser He 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 

Lys Asp Gly Gly Phe Ser Gin Phe He Gly Asp Xaa Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Xaa He Gin Tyr Thr Val Lys Gly Lys Pro Ser He His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr He His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Lys Asp Tyr Gin Thr He Thr Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu He Leu Lys Ser Gin Asn Gly Asp Glu 
645 650 655 

Ala Trp Gly Asp Asn Phe He He Leu Glu He Ser Pro Ser Glu Lys 
660 665 670 

Leu Leu Ser Pro Glu Leu He Asn Thr Asn Asn Trp Thr Ser Thr Gly 
675 680 685 

Ser Thr His He Ser Gly Asn Thr Leu Thr Leu Tyr Gin Gly Gly Arg 
690 695 700 

Gly He Leu Lys Gin Asn Leu Gin Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg He Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750 



Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr He Glu 
755 760 765

Leu Ser Gin Gly Asn Asn Leu Tyr Gly Gly Pro He Val His Phe Tyr 
770 775 780 

Asp Val Ser He Lys 
785 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2366 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CGAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120 

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240 

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 300 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360 

ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 

TACTTAAGTA AACAATTGCA GAATTTCTGA TAAATTAGAT ATTATTAACG TAAATGTTCT 480 

TATTAACTCT ACACTTACTG AAATTACACC TGCATATCAA CGGATTAAAT ATGTGAAGAA 540 

AAATTTGAAG AATTAACTTT TGCTACAGAA ACCACTTTAA AAGTAAAAAA GGATAGCTCG 600 

CCTGCTGATA TTCTTGATGA GTTAACTGAA TTAACTGAAC TAGCGAAAAG TGTTACAAAA 660 

AATGACGTTG ATGGTTTTGA ATTTTACCTT AATACATTCC ACGATGTAAT GGTAGGAAAT 720 

AATTTATTCG GGCGTTCAGC TTTAAAAACT GCTTCAGAAT TAATTGCTAA AGAAAATGTG 780

AAAACAAGTG GCAGTGAAGT AGGAAATGTT TATAATTTCT TAATTGTATT AACAGCTCTA 840 

CAAGCAAAAG CTTTTCTTAC TTTAACAACA TGCCAAAATT ATTAGGCTTA GCAAATATTG 900

ATTATACTTC TATTATGAAT GAACATTTAA ATAAGGAAAA AGAGGAATTT AGAGTAAACA 960 

TCCTTCCTAC ACTTTCTAAT ACTTTTTCTA ATCCTAATTA TGCAAAAGTT AAAGGAAGTG 1020 

ATGAAGATGC AAAGATGATT GTGGAAGCTA AACCAGGATA TGCATTGGTT GGTTTTGAAA 1080

TGAGCAATGA TTCAATCACA GTATTAAAAG TATATGAGGC TAAGCTAAAA CAAAATTATC 1140

AAGTTGATAA GGATTCCTTA TCGGAGGTTA TTTATGGTGA TACGGATAAA TTATTGTGTC 1200 

CAGATCAATC TGAACAAATA TATTATACAA ATAACATAGT ATTTCCAAAT GAATATGTAA 1260 

TTACTAAAAT TGATTTCACT AAAAAAATGA AAACTTTAAG ATATGAGGTA ACAGCGAATT 1320 

TTTATGATTC TTCTACAGGA GAAATTGACT TAAATAAGAA AAAAGTAGAA TCAAGTGAAG 1380

CGGAGTATAG AACGTTAAGT GCTAATGATG ATGGAGTGTA TATGCCATTA GGTGTCATCA 1440

GTGAAACATT TTTGACTCGA TTATGGGTTT GGCCTCCAAG CTGATGGAAA TTCAAGATTA 1500

ATTACTTTAA CATGTAAATC ATATTTAAGA GAACTACTGC TAGCAACAGA CTTAAGCAAT 1560 

AAAGAAACTA AATTGATTGT CCCCCAAGTG GTTTTATTAG CAATATTGTA GAGAACGGGT 1620 

CCATAGAAGA GGACAATTTA GAGCCGTGGA AAGCAAATAA TAAGAATGCG TATGTAGATC 1680 

ATACAGGCGG AGTGAATGGA ACTAAAGCTT TATATGTTCA TAAGGACGGA GGATTTTCAC 1740 

AATTTATTGG AGATAATTAA AACCGAAAAC TGAGTATTAA TCCAATATAC TGTTAAAGGA 1800 

AAACCTTCTA TTCATTTAAA AGATGAAAAT ACTGGATATA TTCATTATGA AGATACAAAT 1860 

AATAATTTAA AAGATTATCA AACTATTACT AAACGTTTTA CTACAGGAAC TGATTTAAAG 1920 

GGAGTGTATT TAATTTTAAA AAGTCAAAAT GGAGATGAAG CTTGGGGAGA TAACTTTATT 1980 

ATTTTGGAAA TTAGTCCTTC TGAAAAGTTA TTAAGTCCAG AATTAATTAA TACAAATAAT 2040 

TGGACGAGTA CGGGATCAAC TCATATTAGC GGTAATACAC TCACTCTTTA TCAGGGAGGA 2100 

CGAGGAATTC TAAAACAAAA CCTTCAATTA GATAGTTTTT CAACTTATAG AGTGTATTTT 2160

TCTGTGTCCG GAGATGCTAA TGTAAGGATT AGAAATTCTA GGGAAGTGTT ATTTGAAAAA 2220 

AGATATATGA GCGGTGCTAA AGATGTTTCT GAAATGTTCA CTACAAAATT TGAGAAAGAT 2280

AACTTTTATA TAGAGCTTTC TCAAGGGAAT AATTTATATG GTGGTCCTAT TGTACATTTT 2340 

TACGATGTCT CTATTAAGTA ACCCAA 2366 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu lie Ala Gin Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu lie Leu Lys lie Ala Asn Glu Gin 
85 90 95 

Asn Gin Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala lie Asn Thr 
100 105 110 

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Phe Met Leu Ser Asp Val
115 120 125 

Met Lys Gin Asn Tyr Ala Leu Ser Leu Gin He Glu Tyr Leu Ser Lys 
130 135 140 

Gin Leu Gin Glu He Ser Asp Lys Leu Asp He He Asn Val Asn Val 
145 150 155 160 

Leu He Asn Ser Thr Leu Thr Glu He Thr Pro Ala Tyr Gin Arg He 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp He Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu He 
245 250 255 

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 



Asn Phe Leu He Val Leu Thr Ala Leu Gin Ala Lys Ala Phe Leu Thr 
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300 

Ser lie Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn lie Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met lie Val Glu Ala Lys 
340 345 350 

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gin Asn Tyr Gin Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val He Tyr Gly Asp Met Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gin Ser Glu Gin He Tyr Tyr Thr Asn Asn He Val Phe 
405 410 415 

Pro Asn Glu Tyr Val He Thr Lys He Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu He Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

He Ser Glu Thr Phe Leu Thr Pro He Asn Gly Phe Gly Leu Gin Ala 
485 490 495 

Asp Glu Asn Ser Arg Leu He Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu He 
515 520 525 

Val Pro Pro Ser Gly Phe He Ser Asn He Val Glu Asn Gly Ser He 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Xaa Asn Xaa Asn Ala Tyr 
545 550 555 560 



Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590 

Thr Glu Tyr Val lie Gin Tyr Thr Val Lys Gly Lys Pro Ser lie His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr lie His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Xaa Xaa Tyr Gin Thr lie Asn Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Xaa Glu
645 650 655 

Ala Trp Gly Asp Asn Phe lie He Leu Glu He Ser Pro Ser Glu Lys 
660 665 670 

Leu Leu Ser Pro Xaa Leu He Asn Thr Xaa Asn Trp Thr Ser Thr Gly 
675 680 685 

Ser Thr Asn He Ser Gly Asn Thr Leu Thr Leu Tyr Gin Gly Gly Arg 
690 695 700 

Gly He Leu Lys Gin Asn Leu Gin Leu Asp Ser Phe Xaa Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg He Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Xaa Val 
740 745 750 

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr He Glu 
755 760 765 

Leu Ser Gin Gly Asn Asn Leu Tyr Gly Gly Pro He Val His Phe Tyr 
770 775 780 

Asp Val Ser He Lys 
785 



(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2362 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360 

ATTACCTTTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540 

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780 

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840 

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG GTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080 

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200 

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCT CCAAGCTGAT GAAAATTCAA 1500 

GATTAATTAC TTTAACATGT AAATCATATT TAAGAGAACT ACTGCTAGCA ACAGACTTAA 1560 

GCAATAAAGA AACTAAATTG ATCGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620

ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CCTGGAAAGC AATAATAGAA TGCGTATGTA 1680

GATCATACAG GCGGAGTGAA TGGAACTAAA GCTTTATATG TTCATAAGGA CGGAGGAATT 1740

TCACAATTTA TTGGAGATAA GTTAAAACCG AAAACTGAGT ATGTAATCCA ATATACTGTT 1800

AAAGGAAAAC CTTCTATTCA TTTAAAAGAT GAAAATACTG GATATATTCA TTATGAAGAT 1860

ACAAATAATA ATTTAAATTA TCAAACTATT AATAAACGTT TTACTACAGG AACTGATTTA 1920

AAGGGAGTGT ATTTAATTTT AAAAAGTCAA AATGGAATGA AGCTTGGGGA GATAACTTTA 1980

TTATTTTGGA AATTAGTCCT TCTGAAAAGT TATTAAGTCC AAATTAATTA ATACAATAAT 2040

TGGACAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT CAGGGAGGAC 2100

GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTCA ACTTATAGAG TGTATTTTTC 2160

TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT TTGAAAAAAG 2220

ATATATGAGC GGTGCTAAAA TGTTTCTGAA ATGTTCACAC AAAATTTGAG AAAGATAACT 2280

TTTATATAGA GCTTTCTCAA GGGAATAATT TATATGGTGG TCCTATTGTA CATTTTTACG 2340

ATGTCTCTAT TAAGTAACCC AA 2362



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 790 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 



Met His Glu Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Ser Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu His He Tyr Leu Pro Lys He Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Lys Gin Asn Tyr Ala Leu Ser Leu Gin He Glu Tyr Leu Ser Lys 
130 135 140 

Gin Leu Gin Glu He Ser Asp Lys Leu Asp He He Asn Val Asn Val 
145 150 155 160 

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Thr Leu Lys Val Lys Lys Asp Xaa Ser Pro Ala Asp He Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu He 
245 250 255 

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285 

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp He Asp Tyr Thr 
290 295 300 

Ser He Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn He Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met He Val Glu Ala Lys 
340 345 350 

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser He Thr 
355 360 365 



Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gin Asn Tyr Gin Val Asp 
370 375 380

Lys Asp Ser Leu Ser Glu Val He Tyr Gly Asp Thr Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gin Ser Glu Gin He Tyr Tyr Thr Asn Asn He Val Phe 
405 410 415 

Pro Asn Glu Tyr Val He Thr Lys He Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

He Ser Glu Thr Phe Leu Thr Pro He Asn Gly Phe Gly Leu Gin Ala 
485 490 495 

Asp Gly Asn Ser Arg Leu He Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Lys Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu He 
515 520 525 

Val Pro Pro Ser Gly Phe He Ser Asn He Val Glu Asn Gly Ser He 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 

Val Asp His Thr Gly Gly Val Lys Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 

Lys Asp Gly Gly He Ser Gin Phe He Gly Asp Xaa Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Val He Gin Tyr Thr Val Lys Gly Lys Pro Ser He His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr He His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Lys Asp Tyr Gin Thr He Thr Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu He Leu Lys Ser Gin Asn Gly Asp Glu 
645 650 655 

Ala Trp Gly Asp Asn Phe He He Leu Glu He Ser Pro Ser Glu Lys 
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700 

Gly He Leu Lys Gin Asn Leu Gin Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg He Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr He Glu 
755 760 765 

Leu Ser Gin Gly Asn Asn Leu Tyr Gly Gly Pro He Val His Phe Tyr 
770 775 780 

Asp Val Xaa He Lys Pro 
785 790 



(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 



ATGCACGAGA ATAATACTAA ATTAAGCGCA AGGGCCTTAC CGAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAG TCAAGTTTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360

ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 540

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATRAC 600

TCGCCTGCTG ATATTCTTGA TGAATTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 660

AAAAATGACG TTGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 840

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140 

TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200 

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAAAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAAAGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740

ATTTCACAAT TTATTGGAGA TAAKTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 

ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 
TTTGAAAA;^ GATATATGAG CGGTGCTTVAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340 
GTGCATTTTT ACGATGTCYC TATTAAGTAA CCCAA 2375 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 554 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Thr Leu His Leu Leu Lys Leu His Leu Arg Ile Lys Gly Leu Asn Met
1 5 10 15

Thr Lys Asn Leu Arg Asn Leu Leu Leu Xaa Xaa Leu Xaa Gln Lys Lys
20 25 30

Arg Met Ala Leu Leu Gln Ile Phe Xaa Met Ser Leu Ser Xaa Asn Arg
35 40 45

Lys Val Gln Lys Met Met Trp Met Val Leu Asn Phe Thr Leu Ile His
50 55 60

Ser Thr Met Xaa Glu Ile Ile Tyr Ser Gly Val Gln Leu Lys Leu Xaa
65 70 75 80

Arg Asn Leu Leu Lys Lys Met Lys Gln Val Ala Val Xaa Xaa Glu Met
85 90 95

Phe Ile Xaa Ser Leu Tyr Gln Leu Xaa Lys Gln Lys Leu Phe Leu Leu
100 105 110

Gln His Ala Glu Asn Tyr Xaa Gln Ile Leu Ile Ile Leu Leu Leu Met
115 120 125

Asn Ile Ile Arg Lys Lys Arg Asn Leu Glu Thr Ser Xaa Leu His Phe
130 135 140

Leu Ile Leu Phe Leu Ile Leu Ile Met Gln Lys Leu Lys Glu Val Met
145 150 155 160

Lys Met Gln Arg Leu Trp Lys Leu Asn Gln Asp Met His Trp Leu Val
165 170 175

Leu Lys Ala Met Ile Gln Ser Gln Tyr Lys Tyr Met Arg Leu Ser Asn
180 185 190
Lys Ile Ile Lys Leu Ile Arg Ile Pro Tyr Arg Arg Leu Phe Met Val
195 200 205

Ile Arg Ile Asn Tyr Cys Val Gln Ile Asn Leu Asn Lys Tyr Ile Ile
210 215 220

Gln Ile Thr Tyr Phe Gln Met Asn Met Leu Leu Lys Leu Ile Ser Leu
225 230 235 240

Lys Lys Lys Leu Asp Met Arg Gln Arg Ile Phe Met Ile Leu Leu Gln
245 250 255

Glu Lys Leu Thr Ile Arg Lys Lys Asn Gln Val Lys Arg Ser Ile Glu
260 265 270

Arg Val Leu Met Met Met Xaa Cys Ile Cys His Val Ser Ser Val Lys
275 280 285

His Phe Leu Arg Met Gly Leu Ala Ser Lys Leu Arg Gln Ile Gln Asp
290 295 300

Leu Leu His Val Asn His Ile Glu Asn Tyr Cys Gln Gln Thr Ala Ile
305 310 315 320

Arg Lys Leu Asn Ser Ser Arg Gln Val Phe Tyr Gln Tyr Cys Arg Glu
325 330 335

Arg Val Leu Arg Arg Gly Gln Phe Arg Ala Val Glu Ser Lys Glu Cys
340 345 350

Val Cys Arg Ser Tyr Arg Arg Ser Glu Trp Asn Ser Phe Ile Cys Ser
355 360 365

Gly Arg Arg Asn Phe Thr Ile Tyr Trp Arg Val Lys Thr Glu Asn Val
370 375 380

Cys Asn Pro Ile Tyr Cys Arg Lys Thr Phe Tyr Ser Phe Lys Arg Lys
385 390 395 400

Tyr Trp Ile Tyr Ser Leu Arg Tyr Lys Phe Lys Arg Leu Ser Asn Tyr
405 410 415

Tyr Thr Phe Tyr Tyr Arg Asn Phe Lys Gly Ser Val Phe Asn Phe Lys 
420 425 430 

Lys Ser Lys Trp Arg Ser Leu Gly Arg Leu Tyr Tyr Phe Gly Asn Ser 
435 440 445 

Phe Lys Val Ile Lys Ser Arg Ile Asn Tyr Lys Leu Asp Glu Tyr Gly
450 455 460

Ile Asn Ser Tyr Arg Tyr Thr His Ser Leu Ser Gly Arg Thr Arg Asn
465 470 475 480

Ser Lys Thr Lys Pro Ser Ile Arg Phe Phe Asn Leu Ser Val Phe Phe
485 490 495

Cys Val Arg Arg Cys Cys Lys Asp Lys Phe Gly Ser Val Ile Lys Lys
500 505 510

Ile Tyr Glu Arg Cys Arg Cys Phe Asn Val His Tyr Lys Ile Glu Arg
515 520 525

Leu Leu Tyr Arg Ala Phe Ser Arg Glu Phe Ile Trp Trp Ser Tyr Cys
530 535 540

Thr Phe Leu Arg Cys Leu Tyr Val Thr Gln



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1888 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

ACTCTACACT TACTGAAATT ACACCTGCGT ATCAAAGGAT TAAATATGTG AACGAAAAAT 60 

TTGAGGAATT AACTTTTGCT ACRGAMACTA ECTTCAAAAGT AAAAAMGGAT GGCTCTCCTS 120 

CAGATATTCT KGATGAGTTA ACTGAGTTAA CWGAACTAGC GAAAAGTGTA ACAAAAAATG 180 

ATGTGGATGG TTTTRAATTT TACCTTAATA CATTCCACGA TGTAT^GGTA GGAAATAATT 240 

TATTCGGGCG TTCAGCTTTA AAAACTGCWT CGGAATTAAT TRCTAAAGAA AATGTGAAAA 300

CAAGTGGCAG TGARGTMGGA AATGTTTATA AYTTCTTAAT TGTATTAACA GCTCTRCAAG 360 

CAAAAGCTTT TCTTACTTTA ACAACATGCC GAAAATTATT AGGSTTAGCA GATATTGATT 420 

ATACTTCTAT TATGAATGAA CATTTAAATA AGGAAAAAGA GGAATTTAGA GTAAACATCC 480 

TYCCTACACT TTCTAATACT TTTTCTAATC CTAATTATGC AAAAGTTAAA GGAAGTGATG 540

AAGATGCAAA GATGATTGTG GAAGCTAAAC CAGGATATGC ATTGGTTGGT TTTGAAATGA 600 

GCAATGATTC AATCACAGTA TTAAAAGTAT ATGAGGCTAA GCTAAAACAA AATTATCAAG 660 

TTGATAAGGA TTCCTTATCG GAGGTTATTT ATGGTGATAC GGATAAATTA TTGTGTCCAG 720

ATCAATCTGA ACAAATATAT TATACAAATA ACATAGTATT TCCAAATGAA TATGTAATTA 780 

CTAAAATTGA TTTCACTAAA AAAATGAAAA CTTTAAGATA TGAGGTAACA GCGAATTTTT 840 



ATGATTCTTC TACAGGAGAA ATTGACTTAA ATAAGAAAAA AGTAGAATCA AGTGAAGCGG 900

AGTATAGAAC GTTAAGTGCT AATGATGATG GRGTGTATAT GCCATTAGGT GTCATCAGTG 960

AAACATTTTT GACTCCGATA AATGGGTTTG GCCTCCAAGC TGAGGCAAAT TCAAGATTAA 1020

TTACTTTAAC ATGTAAATCA TATTTAAGAG AACTACTGCT AGCAACAGAC TTAAGCAATW 1080 

AGGAAACTAA ATTGATCTTC CCX3CCAAGTG TTTTATTAGC AATATTGTAG AGAACGGGTC 1140 

CTTAGAAGAG GACAATTTAG AGCCGTGGAA AGCAAATAAT AAGAATGCGT ATGTAGATCA 1200 

TACAGGCGGA GTGAATGGAA CTAAAGCTTT ATATGTTCAT AAGGACGGAG GAATTTCACA 1260

ATTTATTGGA GATAAGTTAA AACCGAAAAC TGAGTATGTA ATCCAATATA CTGTTAAAGG 1320 

AAAACCTTCT ATTCATTTAA AAGATGAAAA TACTGGATAT ATTCATTATG AAGATACAAA 1380 

TAATAATTTA AAAGATTATC AT^CTATTAC TAAACGTTTT ACTACAGGAA CTGATTTAAA 1440 

GGGAGTGTAT TTAATTTTAA AAAGTCAAAA TGGAGATGAA GCTTGGGGAG ATAACTTTAT 1500 

TATTTTGGAA ATTAGTCCTT CTGAAAAGTT ATTAAGTCCA GAATTAATTA ATACAAATAA 1560 

TTGGACGAGT ACGGGATCAA CTCATATTAG CGGTAATACA CTCACTCTTT ATCAGGGAGG 1620 

ACGAGGAATT CTAAAACAAA ACCTTCAATT AGATAGTTTT TCAACTTATA GAGTGTATTT 1680 

TTCTGTGTCC GGAGATGCTA ATGTAAGGAT TAGAAATTCT AGGGAAGTGT TATTTGAAAA 1740 

AAGATATATG AGCGGTGCTA AAGATGTTTC TGAAATCTTC ACTACAAAAT TTGAGAAAGA 1800 

TAACTTTTAT ATAGAGCTTT CTCAAGGGAA TAATTTATAT GGTGGTCCTA TTGTACATTT 1860 

TTACGATGTC TCTATTAAGT AACCCAAA 1888